Chapter 1
Pascal Weil

LaBRI, Université de Bordeaux and CNRS, Bordeaux, France

This introductory chapter is a tutorial on finite automata. We present the
standard material on determinization and minimization, as well as an account of the
equivalence of finite automata and monadic second-order logic. We conclude with 
an introduction to the syntactic monoid, and as an application give a proof of the 
equivalence of first-order definability and aperiodicity. 

1.1. Introduction 
1.1.1. Motivation 

The word automaton (plural: automata) was originally used to refer to devices 
like clocks and watches, as well as mechanical marvels built to resemble moving 
humans and animals, whose internal mechanisms are hidden and which thus appear 
to operate spontaneously. In theoretical computer science, the finite automaton is 
among the simplest models of computation: A device that can be in one of finitely 
many states, and that receives a discrete sequence of inputs from the outside world, 
changing its state accordingly. This is in marked contrast to more general and 
powerful models of computation, such as Turing machines, in which the set of 
global states of the device — the so-called instantaneous descriptions — is infinite. 
A finite automaton is more akin to the control unit of the Turing machine (or, 
for that matter, the control unit of a modern computer processor), in which the 
present state of the unit and the input symbol under the reading head determine 
the next state of the unit, as well as signals to move the reading head left or right 
and to write a symbol on the machine's tape. The crucial distinction is that while 
the Turing machine can record and consult its entire computation history, all the 
information that a finite automaton can use about the sequence of inputs it has
seen is represented in its current state. 

But as rudimentary as this computational model may appear, it has a rich 
theory, and many applications. In this introductory chapter, we will present the 
core theory: that of a finite automaton reading a finite word, that is, a finite string of 
inputs, and using the resulting state to decide whether to accept or reject the word. 
The central question motivating our presentation is to determine what properties 
of words can be decided by finite automata. Subsequent chapters will present both
generalizations of the basic model (to devices that read infinite words, labeled trees,
etc.) and applications. An important theme in this chapter, as well as throughout
the volume, is the close connection between automata and formal logic. 

1.1.2. Plan of the chapter 

In Section 1.2 we introduce finite automata as devices for recognizing formal
languages, and show the equivalence of several variants of the basic model, most
notably the equivalence of deterministic and nondeterministic automata. Section 1.3
describes Büchi's sequential calculus, the framework in predicate logic for
describing properties of words that are recognizable by finite automata. In Section 1.4
we prove what might well be described as the two fundamental theorems of finite
automata: that the languages recognized by finite automata are exactly those
definable by sentences of the sequential calculus, and also exactly those definable by
rational expressions (also called regular expressions). Section 1.5 presents methods
that can be used to show that certain languages cannot be recognized by finite automata.
The last sections, 1.6 and 1.7, have a more algebraic flavor: we introduce both the
minimal automaton and the syntactic monoid of a language, and prove the
important McNaughton-Schützenberger theorem describing the languages definable in the
first-order fragment of the sequential calculus.

1.1.3. Notation 

Throughout this chapter, A denotes a finite alphabet, that is, a finite non-empty set. 
Elements of A are called letters, and a finite sequence of letters is called a word. We 
denote words simply by concatenating the letters, so, for example, if A = {a, b, c}, 
then aabacba is a word over A. The empty sequence is considered a word, and we
use ε to denote this sequence. The set of all words over A is denoted A*, and the
set of all nonempty words is denoted A^+. The length of the word w, that is, the
number of letters in w, is denoted |w|.

If u, v ∈ A* then we can form a new word uv by concatenating the two
sequences. Concatenation of words is obviously an associative and (unless A has a
single element) noncommutative operation on A*. We have

|uv| = |u| + |v|, and
uε = εu = u.

(Other texts frequently use Λ or 1 to denote the empty word. The latter choice is
justified by the second equation above.)

A subset of A* is called a language over A. 

1.1.4. Historical note and references 

This chapter contains a modern presentation of material that goes back more than 
fifty years. The reader can find other accounts in classic papers and texts: The
equivalence of finite automata and rational expressions given in Section 1.4 was
first described by Kleene in [5]. The connection with monadic second-order logic
was found independently by Trakhtenbrot [23] and Büchi [JJ.

Nondeterministic automata were introduced by Rabin and Scott [17], who 
showed their equivalence to deterministic automata. Minimization of finite-state 
devices (framed in the language of switching circuits built from relays) is due to 
Huffman [8]. The simple congruential account of minimization that we give
originates with Myhill [13] and Nerode [14].

The equivalence of aperiodicity of the syntactic monoid with star-freeness is due
to Schützenberger [19], and the connection with first-order logic is from McNaughton
and Papert [11]. Our account of these results relies heavily on an argument given
in Wilke [23].

Rational expressions, determinization and minimization have become part of the 
basic course of study in theoretical computer science, and as such are described in a 
number of undergraduate textbooks. Hopcroft and Ullman [7], Lewis and
Papadimitriou [10] and the more recent Sipser [20] are notable examples. A more technical
and algebraically oriented account is given in the monograph by Eilenberg [4] [5].
An algebraic view of automata is developed by Sakarovitch [18]. Detailed accounts
of the connection between automata, logic and algebra can be found in
Straubing [21] and Thomas [22]. The state of the art, especially concerning the algebraic
classification of automata, will appear in the forthcoming handbook [16].

1.2. Automata and rational expressions 
1.2.1. Operations on languages 

We describe here a collection of basic operations on languages, which will be building 
blocks in the characterization of the expressive power of automata. 

Since languages over A are subsets of A* , we may of course consider the boolean 
operations: union, intersection and complement. The product operation on words 
can be naturally extended to languages: if K and L are languages over A, we define 
their concatenation product KL to be the set of all products of a word in K followed 
by a word in L: 



KL = {uv | u ∈ K and v ∈ L}.

We also use the power notation for languages: if n > 0, L^n is the product LL···L
of n copies of L. We let L^0 = {ε}. Note that if n > 1, L^n generally differs from the set of
n-th powers of the elements of L. The iteration (or Kleene star) of a language L is
the language L* = ⋃_{n≥0} L^n.
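As an illustration of these definitions, here is a minimal Python sketch on finite languages represented as sets of strings. The function names (concat, power, star_up_to) are hypothetical, and since L* is usually infinite the last function only enumerates its words up to a chosen length.

```python
def concat(K, L):
    """Concatenation product KL = {uv | u in K and v in L}."""
    return {u + v for u in K for v in L}

def power(L, n):
    """L^n: product of n copies of L, with L^0 = {""} (the empty word)."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """Words of L* of length at most max_len (a finite slice of L*)."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result

L = {"a", "ab"}
print(sorted(power(L, 2)))        # ['aa', 'aab', 'aba', 'abab']
print(sorted(w + w for w in L))   # ['aa', 'abab'], the squares of elements of L
```

The two printed sets differ, which is the point of the remark on L^n versus the set of n-th powers.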

Finally, we introduce a simple rewriting operation, based on the use of
morphisms. If A and B are alphabets, a morphism from A* to B* is a mapping
φ: A* → B* such that

(1) φ(ε) = ε,
(2) for all u, v ∈ A*, φ(uv) = φ(u)φ(v).

To specify such a morphism, it suffices to give the images of the letters of A.
Then the image of a word u ∈ A*, say u = a_1···a_n, is obtained by taking
the concatenation of the images of the letters, φ(u) = φ(a_1)···φ(a_n). That is,
φ(a_1···a_n) is obtained from a_1···a_n by substituting for each letter a_i the word
φ(a_i). This operation naturally extends from words to languages: if L ⊆ A*, then
φ(L) = {φ(u) | u ∈ L}.
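The following Python sketch shows how a morphism is determined by the images of the letters. The helper name extend_morphism and the particular images chosen are assumptions made for the example only.

```python
def extend_morphism(images):
    """Return the morphism phi: A* -> B* determined by the letter images."""
    def phi(word):
        # Concatenate the images of the letters, in order.
        return "".join(images[letter] for letter in word)
    return phi

# Hypothetical morphism from {a, b, c}* to {0, 1}*; note that b is erased.
phi = extend_morphism({"a": "01", "b": "", "c": "110"})

assert phi("") == ""                               # condition (1)
assert phi("ab" + "ca") == phi("ab") + phi("ca")   # condition (2) on one instance
image = {phi(u) for u in {"a", "bc", "abc"}}       # phi(L) for a finite language L
print(sorted(image))                               # ['01', '01110', '110']
```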

The consideration of these operations leads to the classical definition of rational 
languages (also called regular languages). The operations of union, concatenation 
and iteration are called the rational operations. A language over alphabet A is called 
rational if it can be obtained from the letters of A by applying (a finite number of) 
rational operations. 

More formally, the class of rational languages over the alphabet A, denoted 
Rat A*, is the least class of languages such that 

(1) the languages ∅ and {a} are rational for each letter a ∈ A;

(2) if K and L are rational languages, then K U L, KL and L* are also rational. 



Example 1.1. The language ((a*{ab)*A*nA*(ba)*) J is rational. (Note that in 
order to lighten the notation, we write a, b, etc., instead of {a}, {b}.)

The language {ε}, containing just the empty word, is rational. Indeed, it is
equal to ∅*.

Any finite language (that is, containing only finitely many words) is rational. 

Let a, b ∈ A be distinct letters. It is instructive to show that the following
languages are rational: (a) the set of all words which do not contain two consecutive
a's; (b) the set of all words which contain the factor ab but not the factor ba.
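One possible way to test candidate answers is a brute-force comparison on short words. The sketch below assumes the special case A = {a, b} and uses Python's re module as a stand-in for rational expressions (x? abbreviates x ∪ ε and x+ abbreviates xx*). The two expressions are candidates for this two-letter alphabet only, and a finite check is evidence, not a proof.

```python
import re
from itertools import product

ALPHABET = "ab"   # assumption: A = {a, b}; larger alphabets need other expressions

def words_up_to(n):
    """All words over ALPHABET of length at most n."""
    for k in range(n + 1):
        for letters in product(ALPHABET, repeat=k):
            yield "".join(letters)

no_aa = re.compile(r"(b|ab)*a?")   # candidate for (a)
ab_not_ba = re.compile(r"a+b+")    # candidate for (b), for A = {a, b} only

for w in words_up_to(8):
    assert bool(no_aa.fullmatch(w)) == ("aa" not in w)
    assert bool(ab_not_ba.fullmatch(w)) == ("ab" in w and "ba" not in w)
```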

We also consider the extended rational operations: these are the rational
operations, and the operations of intersection, complement and morphic image. A
language is said to be extended rational if it can be obtained from the letters of A
by applying (a finite number of) extended rational operations. The class of extended
rational languages over A is written X-Rat A*.

Of course, all rational languages are extended rational. The definition of
extended rational languages offers more expressive possibilities but, as we will see,
they are not properly more expressive than rational languages.

Fig. 1.1. The automaton of a (simplified) coffee machine

1.2.2. Automata 

Let us start with a couple of examples. 

Example 1.2. A coffee machine delivers a cup of coffee for €.25. It accepts only
coins of €.20, €.10 and €.05. While determining whether it has received a sufficient
sum, the machine is in one of six states, q_0, q_0.05, q_0.1, q_0.15, q_0.2 and q_0.25. The
names of the states correspond to the sum already received. The machine changes
state after a new coin is inserted, and the new state it assumes is a function of
the value of the new coin inserted and of the sum already received. The latter
information is encoded in the current state of the machine.

Here, the input word is the sequence of coins inserted, and the alphabet consists
of three letters, w, t and f, standing respectively for twenty cents, ten cents and
five cents. The machine is represented in Figure 1.1.

The incoming arrow indicates the initial state of the machine (q_0), and the
outgoing arrow indicates the only accepting state (q_0.25), that is, the state in which
the machine will indeed prepare a cup of coffee for you. Notice that the machine
does not return change, but that it will accept sums up to €.40.

Example 1.3. Our second example (Figure 1.2) reads an integer, given by its
binary expansion and read from right to left, that is, starting with the bit of least
weight. Upon reading this word on alphabet {0, 1}, the automaton decides whether
the given integer is divisible by 3 or not.

For instance, consider the integer 19, in binary expansion 10011: our input
word is 11001. It is read letter by letter, starting from the initial state (the state
indicated by an incoming arrow, state r_0). After each new letter is read, we follow
the corresponding edge starting at the current state. Thus, starting in state r_0,
we visit successively the states r'_1, r_0, r'_0, r_0 again, and finally r'_1. This state is
not accepting (it is not marked with an outgoing edge), so the word 11001 is not
accepted by the automaton. And indeed, 19 is not divisible by 3.

Fig. 1.2. An automaton to compute mod 3 remainders

In contrast, 93 is divisible by 3, which is confirmed by running its binary
expansion, namely 1011101, read from right to left, through the automaton: starting in
state r_0, we end in state r'_0.

The reader will quickly see that this automaton is constructed in such a way
that, if n is an integer and w_n is the binary expansion of n, then the state reached
when reading w_n from right to left, starting in state r_0, is r_k (resp. r'_k) if n is
congruent to k (mod 3) and w_n has even (resp. odd) length.
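The invariant just stated can be sketched in Python. The encoding below is an assumption made for illustration: a state is a pair (k, parity), with parity 0 standing for r_k and parity 1 for r'_k. It relies on the fact that 2^i is congruent to 1 (mod 3) for even i and to 2 (mod 3) for odd i.

```python
def run_mod3(bits_lsb_first):
    """Run the automaton on a binary word read from the least significant bit.

    Returns (k, parity): k is the value mod 3 of the bits read so far,
    parity is the number of bits read mod 2 (0 for r_k, 1 for r'_k).
    """
    k, parity = 0, 0                        # initial state r_0
    for bit in bits_lsb_first:
        weight = 1 if parity == 0 else 2    # 2^i mod 3 for the current position i
        k = (k + int(bit) * weight) % 3
        parity ^= 1
    return k, parity

def divisible_by_3(n):
    word = format(n, "b")[::-1]             # binary expansion, right to left
    k, _ = run_mod3(word)
    return k == 0

print(run_mod3("11001"))     # (1, 1): state r'_1, so 19 is rejected
print(run_mod3("1011101"))   # (0, 1): state r'_0, so 93 is accepted
```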

We now turn to a formal definition. A (finite state) automaton on alphabet A
is a 4-tuple A = (Q, T, I, F) where Q is a finite set, called the set of states, T is a
subset of Q × A × Q, called the set of transitions, and I and F are subsets of Q,
called respectively the sets of initial states and final states. Final states are also
called accepting states.

For instance, the automaton of Example 1.2 uses a 3-letter alphabet, A =
{f, t, w}. Formally, it is the automaton A = (Q, T, I, F) given by Q =
{q_0, q_0.05, q_0.1, q_0.15, q_0.2, q_0.25}, I = {q_0}, F = {q_0.25} and T is a 15-element subset
of Q × A × Q containing such triples as (q_0, f, q_0.05), (q_0.1, t, q_0.2) or (q_0.2, w, q_0.25).

As in our first examples, it is often convenient to represent an automaton A =
(Q, T, I, F) by a labeled graph, whose vertices are the elements of Q (the states)
and whose edges are of the form q --a--> q' if (q, a, q') is a transition, that is, if
(q, a, q') ∈ T. The initial states are specified by an incoming arrow, and the final
states are specified by an outgoing edge.

From now on, we will most often specify our automata by their graphical
representations.

Example 1.4. Here, the alphabet is A = {a, b}. Figure 1.3 represents the
automaton A = (Q, T, I, F) where Q = {1, 2, 3}, I = {1}, F = {3} and

T = {(1, a, 1), (1, b, 1), (1, a, 2), (2, b, 3), (3, a, 3), (3, b, 3)}.

Fig. 1.3. An automaton accepting A*abA*

Fig. 1.4. Another automaton accepting A*abA*

1.2.2.1. The language accepted by an automaton 

A path in automaton A is a sequence of consecutive edges,

p = (q_0, a_1, q_1)(q_1, a_2, q_2) ··· (q_{n-1}, a_n, q_n),

also drawn as

p = q_0 --a_1--> q_1 --a_2--> q_2 ··· --a_n--> q_n.

Then we say that p is a path of length n from q_0 to q_n, labeled by the word u =
a_1a_2···a_n. By convention, for each state q, there exists an empty path from q to q
labeled by the empty word.

For instance, in the automaton of Figure 1.3 the word a^3ba labels exactly four
paths: from 1 to 1, from 1 to 2, from 1 to 3 and from 3 to 3.
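The count of four paths can be checked mechanically. The sketch below takes the transition set T of Example 1.4 as given in the text. The function name paths and the representation of a path as a list of visited states are choices made for this illustration.

```python
# Transitions of the automaton of Example 1.4 (Figure 1.3).
T = {(1, "a", 1), (1, "b", 1), (1, "a", 2), (2, "b", 3), (3, "a", 3), (3, "b", 3)}

def paths(transitions, start, word):
    """All paths labeled `word` starting at `start`, each as a list of states."""
    found = [[start]]
    for letter in word:
        # Extend every partial path by each matching transition.
        found = [p + [q2] for p in found
                 for (q1, a, q2) in transitions if q1 == p[-1] and a == letter]
    return found

all_paths = [p for q in (1, 2, 3) for p in paths(T, q, "aaaba")]
endpoints = sorted((p[0], p[-1]) for p in all_paths)
print(endpoints)   # [(1, 1), (1, 2), (1, 3), (3, 3)]
```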

A path p is successful if its initial state is in I and its final state is in F. A word
w is accepted (or recognized) by A if there exists a successful path in the automaton 
with label w. And the language accepted (or recognized) by A is the set of labels of 
successful paths in A. It is denoted by L(A). We say that A accepts (or recognizes) 
L(A). 

For instance, the language of the automaton of Figure 1.1 is finite, with exactly
27 words. The automaton of Figure 1.3 accepts the set of words in which at least one
occurrence of a is followed immediately by a b, namely A*abA*, where A = {a, b}.
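Acceptance can be sketched in Python by tracking the set of states reachable from I. The encoding of the coffee machine below is an assumption reconstructed from the text, since the figure is not available here: states are the sums in cents, and any sum of 25 or more leads to the accepting state. This reading is at least consistent with the 15 transitions and the 27 accepted words mentioned above.

```python
from itertools import product

def accepts(automaton, word):
    """True if some successful path of the automaton is labeled by `word`."""
    Q, T, I, F = automaton
    current = set(I)                      # states reachable from I so far
    for letter in word:
        current = {q2 for (q1, a, q2) in T if q1 in current and a == letter}
    return bool(current & set(F))

# Assumed encoding of Example 1.2: sums in cents, capped at 25.
coins = {"f": 5, "t": 10, "w": 20}
coffee_T = {(s, c, min(s + v, 25)) for s in range(0, 25, 5) for c, v in coins.items()}
coffee = (set(range(0, 30, 5)), coffee_T, {0}, {25})

# No accepted word is longer than 5 letters (five coins of five cents).
accepted = [w for n in range(6) for w in map("".join, product("ftw", repeat=n))
            if accepts(coffee, w)]
print(len(coffee_T), len(accepted))       # 15 27

# The automaton of Example 1.4 (Figure 1.3), accepting A*abA*.
fig_1_3 = ({1, 2, 3},
           {(1, "a", 1), (1, "b", 1), (1, "a", 2), (2, "b", 3), (3, "a", 3), (3, "b", 3)},
           {1}, {3})
print(accepts(fig_1_3, "bab"), accepts(fig_1_3, "ba"))   # True False
```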

Different automata may recognize the same language: if A and B are automata 
such that L(A) = L(B), we say that A and B are equivalent. 

Example 1.5. The language A*abA*, accepted by the automaton in Figure 1.3, is
also recognized by the automaton in Figure 1.4.



A language L is said to be recognizable if it is recognized by an automaton.

Fig. 1.5. Two automata accepting b*a*



1.2.2.2. Complete automata 

An automaton A = (Q, T, I, F) on alphabet A is said to be complete if, for each
state q ∈ Q and each letter a ∈ A, there exists at least one transition of the
form (q, a, q'): in graphical representation, this means that, for each letter of the
alphabet, there is an edge labeled by that letter starting from each state. Naturally,
this easily implies that, for each state q and each word w ∈ A*, there exists at least
one path labeled w starting at q.

Every automaton can easily be turned into an equivalent complete automaton.
If A = (Q, T, I, F) is not complete, the completion of A is the automaton A_comp =
(Q', T', I, F) given by Q' = Q ∪ {z}, where z is a new state not in Q, and T' is
obtained by adding to T all triples (z, a, z) (a ∈ A) and all triples (q, a, z) (q ∈ Q,
a ∈ A) such that there is no element of the form (q, a, q') in T.

If A is complete, we let A_comp = A. It is immediate that, in every case, A_comp
is complete and L(A_comp) = L(A).

Example 1.6. Let A = {a, b}. The automaton B in Figure 1.5, which accepts the
language b*a*, is evidently not complete. The automaton B_comp is represented next
to it.
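The completion can be sketched as follows. The two-state automaton B used here is an assumed reading of Figure 1.5, which is not reproduced in this text. The name complete and the default sink name "z" are likewise choices made for the example.

```python
def complete(automaton, alphabet, sink="z"):
    """Return the completion of the automaton, adding a sink state if needed."""
    Q, T, I, F = automaton
    # Pairs (q, a) with no outgoing a-labeled transition from q.
    missing = {(q, a) for q in Q for a in alphabet
               if not any(p == q and b == a for (p, b, _) in T)}
    if not missing:
        return automaton                      # already complete
    assert sink not in Q, "the sink must be a new state"
    T_comp = (set(T)
              | {(q, a, sink) for (q, a) in missing}
              | {(sink, a, sink) for a in alphabet})
    return (set(Q) | {sink}, T_comp, I, F)

# Assumed automaton for b*a*: loop b on 1, a from 1 to 2, loop a on 2.
B = ({1, 2}, {(1, "b", 1), (1, "a", 2), (2, "a", 2)}, {1}, {1, 2})
Q_c, T_c, I_c, F_c = complete(B, "ab")
print(sorted(T_c - B[1], key=str))   # the three added transitions
```

Since the sink is not final and cannot be left, no successful path goes through it, which is why the language is unchanged.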



1.2.2.3. Trim automata 

A complete automaton reads its entire input before deciding to accept or reject it: 
whatever input it receives, there is a transition that can be followed. However, we 
have seen that in the completion A_comp of a non-complete automaton A, state z does
not participate in any successful path: it is in a way a useless state. Trimming an 
automaton removes such useless states; it is, in a sense, the opposite of completing 
an automaton, and aims at producing a more concise device. 

A state q of an automaton A is said to be accessible if there exists a path in 
A starting from some initial state and ending at q. State q is co-accessible if there 
exists a path in A starting from q and ending at some final state. Observe that a 
state is both accessible and co-accessible if and only if it is visited by at least one 
successful path. 

The automaton A itself is trim if all its states are both accessible and
co-accessible: in a trim automaton, each state is useful, in the sense that it is used in
accepting some word of the language L(A).

Of course, every automaton A is equivalent to a trim one, written A_trim, obtained
by restricting A to its accessible and co-accessible states and to the transitions
between them.

Interestingly, A_trim can be constructed efficiently, using breadth-first search.
One first computes the accessible states of A, by letting Q_0 = I (the initial states
are certainly accessible) and by computing iteratively

Q_{n+1} = Q_n ∪ ⋃_{q ∈ Q_n, a ∈ A} {q' ∈ Q | (q, a, q') ∈ T}.

One verifies that the elements of Q_n are the states that can be reached from an
initial state, reading a word of length at most n; and that if two consecutive sets Q_n
and Q_{n+1} are equal, then Q_n = Q_m for all m > n, and Q_n is the set of accessible
states of A. In particular, the set of accessible states is computed in at most |Q|
steps.

A similar procedure, starting from the final states instead of the initial states,
and working in reverse, produces in at most |Q| steps the set of co-accessible states
of A. The automaton A_trim is then immediately constructed.
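A minimal Python sketch of this procedure follows. It implements the iteration Q_0, Q_1, ... literally and not as an optimized queue-based search. The function names accessible and trim are hypothetical.

```python
def accessible(T, sources):
    """States reachable from `sources`: iterate Q_{n+1} from Q_n until stable."""
    reached = set(sources)
    while True:
        step = reached | {q2 for (q1, _, q2) in T if q1 in reached}
        if step == reached:
            return reached
        reached = step

def trim(automaton):
    """Restrict the automaton to its accessible and co-accessible states."""
    Q, T, I, F = automaton
    forward = accessible(T, I)
    # Co-accessible states: accessible from F along reversed transitions.
    backward = accessible({(q2, a, q1) for (q1, a, q2) in T}, F)
    useful = forward & backward
    T_trim = {(p, a, q) for (p, a, q) in T if p in useful and q in useful}
    return (useful, T_trim, set(I) & useful, set(F) & useful)

# A complete automaton for b*a* with a sink state "z" (an illustrative example).
A = ({1, 2, "z"},
     {(1, "b", 1), (1, "a", 2), (2, "a", 2), (2, "b", "z"), ("z", "a", "z"), ("z", "b", "z")},
     {1}, {1, 2})
print(trim(A)[0])   # {1, 2}
```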

Remark 1.1. The construction of A_trim, or indeed, just of the set of accessible
states of A, provides an efficient solution of the emptiness problem: given an
automaton A, is the language L(A) empty, or does A accept at least one word?

Indeed, A recognizes the empty set if and only if no final state is accessible: in
order to decide the emptiness problem for automaton A, it suffices to construct the
set of accessible states of A and verify whether it contains a final state. This yields
an O(|Q|^2 |A|) algorithm.
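The emptiness test of this remark can be sketched in a few lines of Python. The function name is_empty and the worklist-based search are choices made for this illustration, and no attempt is made to match the stated complexity bound exactly.

```python
def is_empty(automaton):
    """True if the automaton accepts no word, i.e. no final state is accessible."""
    Q, T, I, F = automaton
    reached, frontier = set(I), list(I)
    while frontier:
        q = frontier.pop()
        for (q1, _, q2) in T:
            if q1 == q and q2 not in reached:
                reached.add(q2)
                frontier.append(q2)
    return not (reached & set(F))

# State 3 is final but cannot be reached from the initial state 1.
unreachable_final = ({1, 2, 3}, {(1, "a", 2), (3, "b", 3)}, {1}, {3})
print(is_empty(unreachable_final))   # True
```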

1.2.2.4. Epsilon-automata

It is sometimes convenient to extend the notion of automata to the so-called
ε-automata: the difference from ordinary automata is that we also allow ε-labeled
transitions, of the form (p, ε, q) with p, q ∈ Q.

Proposition 1.1. Every ε-automaton is equivalent to an ordinary automaton.

Sketch of proof. Let A = (Q, T, I, F) be an ε-automaton, and let R be the relation
on Q given by p R q if there exists a path from p to q consisting only of ε-labeled
transitions (that is: R is the reflexive transitive closure of the relation defined by
the ε-labeled transitions of A).

Let A' be the (ordinary) automaton given by the tuple (Q, T', I', F) with

T' = {(p, a, q) | (p, a, q') ∈ T and q' R q for some q' ∈ Q},
I' = {q | p R q for some p ∈ I}.

Then A' is equivalent to A. □

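The construction in this sketch of proof can be written out in Python as follows. Representing the label ε by the empty string is an assumption of this example, as are the function names. The closure function computes the set of states q with p R q for some p in the given set.

```python
EPS = ""   # assumption: the empty string stands for an ε label

def eps_closure(T, states):
    """All q such that p R q for some p in `states` (R as in the proof)."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for (p1, a, q) in T:
            if p1 == p and a == EPS and q not in closure:
                closure.add(q)
                stack.append(q)
    return closure

def remove_epsilon(automaton):
    """Build (Q, T', I', F) from an ε-automaton (Q, T, I, F)."""
    Q, T, I, F = automaton
    T_new = {(p, a, q) for (p, a, q1) in T if a != EPS
             for q in eps_closure(T, {q1})}
    return (Q, T_new, eps_closure(T, I), F)

# Illustrative ε-automaton for a*b*: loop a on 1, ε from 1 to 2, loop b on 2.
E = ({1, 2}, {(1, "a", 1), (1, EPS, 2), (2, "b", 2)}, {1}, {2})
Q2, T2, I2, F2 = remove_epsilon(E)
print(sorted(T2), sorted(I2))   # [(1, 'a', 1), (1, 'a', 2), (2, 'b', 2)] [1, 2]
```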
1.2.3. Deterministic automata

Example 1.7. Consider the automaton of Figure 1.3, say A, and the automaton B
of Figure 1.4. Both recognize the language L = A*abA*, but there is an important,
qualitative difference between them.

We have defined automata as nondeterministic computing devices: given a state 
and an input letter, there may be several possible choices for the next state. Thus 
an input word might be associated with many different computation paths, and the 
word is accepted if one of these paths ends at an accepting state. In contrast, B 
has the convenient property that each input word labels at most one computation 
path. 

These remarks are formalized in the following definition. An automaton A =
(Q, T, I, F) is said to be deterministic if it has exactly one initial state, and if, for
each letter a and for all states q, q', q'',

(q, a, q') ∈ T and (q, a, q'') ∈ T  imply  q' = q''.

Thus, of the automata in Figures 1.3 and 1.4, the second one is deterministic, and
the first is non-deterministic.

This definition imposes a certain condition of uniqueness on transitions, that is, 
on paths of length 1. This property is then extended to longer paths by a simple 
induction. 

Proposition 1.2. Let A be a deterministic automaton and let w be a word. 

(1) For each state q of A, there exists at most one path labeled w starting at q. 

(2) If w ∈ L(A), then w labels exactly one successful path.

In particular, we can represent the set of transitions of a deterministic automaton
A = (Q, T, I, F) by a transition function: the (possibly partial) function δ: Q × A →
Q which maps each pair (q, a) ∈ Q × A to the state q' such that (q, a, q') ∈ T (if it
exists). This function is then naturally extended to the set Q × A*: if q ∈ Q and
w ∈ A*, δ(q, w) is the state q' such that there exists a path from q to q' labeled
by w in A (if such a state exists). In the sequel, deterministic automata will be
specified as 4-tuples (Q, δ, i, F) instead of the corresponding (Q, T, {i}, F). We note
the following elementary characterization of δ.

Proposition 1.3. Let A = (Q, δ, i, F) be a deterministic automaton. Then we have

δ(q, ε) = q,
δ(q, ua) = δ(δ(q, u), a) if δ(q, u) and δ(δ(q, u), a) exist, and is undefined otherwise,
u ∈ L(A) if and only if δ(i, u) ∈ F,

for each state q, each word u ∈ A* and each letter a ∈ A.

Fig. 1.6. The subset automaton of the automaton in Figure 1.3

Again, it turns out that every automaton is equivalent to a deterministic
automaton. This deterministic automaton can be effectively constructed, although
the algorithm, the so-called subset construction, is more complicated than those
used to construct complete or trim automata.

Let A = (Q, T, I, F) be an automaton. The subset transition function of A is
the function δ: 𝒫(Q) × A → 𝒫(Q) defined, for each P ⊆ Q and each a ∈ A, by

δ(P, a) = {q ∈ Q | ∃p ∈ P, (p, a, q) ∈ T}.

Thus, δ(P, a) is the set of states of A which can be reached by an a-labeled
transition, starting from an element of P. The subset automaton of A is A_sub =
(𝒫(Q), δ, I, F_sub) where F_sub = {P ⊆ Q | P ∩ F ≠ ∅}.

The automaton A_sub is deterministic and complete by construction, and the
subset transition function of A is the transition function of A_sub. Moreover, if A
has n states, then A_sub has 2^n states.
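The definition of A_sub translates almost literally into Python, as in the sketch below. It builds all 2^n subsets explicitly, so it is only meant for very small automata. The function names and the use of frozenset for subsets are choices made for this illustration.

```python
from itertools import combinations

def subset_delta(T, P, a):
    """δ(P, a): states reachable from some state of P by an a-labeled transition."""
    return frozenset(q for (p, b, q) in T if p in P and b == a)

def subset_automaton(automaton, alphabet):
    """The full subset automaton (𝒫(Q), δ, I, F_sub), with δ as a dictionary."""
    Q, T, I, F = automaton
    subsets = [frozenset(c) for r in range(len(Q) + 1)
               for c in combinations(sorted(Q), r)]
    delta = {(P, a): subset_delta(T, P, a) for P in subsets for a in alphabet}
    finals = {P for P in subsets if P & set(F)}
    return (set(subsets), delta, frozenset(I), finals)

# The automaton of Example 1.4 (Figure 1.3).
fig_1_3 = ({1, 2, 3},
           {(1, "a", 1), (1, "b", 1), (1, "a", 2), (2, "b", 3), (3, "a", 3), (3, "b", 3)},
           {1}, {3})
states, delta, start, finals = subset_automaton(fig_1_3, "ab")
print(len(states), sorted(delta[(start, "a")]))   # 8 [1, 2]
```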

Example 1.8. The subset automaton of the non-deterministic automaton of
Figure 1.3 is given in Figure 1.6. Notice that the states of the second row are not
accessible.

Proposition 1.4. The automata A and A_sub are equivalent.

Sketch of proof. Let A = (Q, T, I, F). One shows by induction on |w| that for all
P ⊆ Q and w ∈ A*, δ(P, w) is the set of all states q ∈ Q such that w labels a path
in A starting at some state in P and ending at q.

Therefore, a word w is accepted by A if and only if at least one final state lies
in the set δ(I, w), if and only if δ(I, w) ∈ F_sub, if and only if w is accepted by A_sub.
This concludes the proof. □



In general, the subset automaton is not trim (see Example 1.8) and we can find a
deterministic automaton smaller than A_sub, which still recognizes the same language
as A, namely by trimming A_sub. Observe that in the proof of Proposition 1.4 the
only useful states of A_sub are those of the form δ(I, w), that is, the accessible states
of A_sub.

We define the determinized automaton of A to be A_det = (A_sub)_trim. This
automaton is equivalent to A.

September 22, 2011 4:33 - World Scientific Review Volume - 9.75in x 6.5in - H. Straubing and P. Weil

Example 1.9. The determinized automaton of the non-deterministic automaton
of Figure 1.3 consists of the first row of states in Figure 1.6 (see Example 1.8).

An obstacle in the computation of A_det is the explosion in the number of states:
if A has n states, then A_sub has 2^n states. The determinized automaton A_det may
well have exponentially many states as well, but it sometimes has fewer. Therefore,
it makes sense to try and compute A_det directly, in time proportional to its actual
number of states, rather than first constructing the exponentially large automaton
A_sub and then trimming it.

This can be done using the same ideas as in the construction of A_trim in
Section 1.2.2.3. One first constructs B, the accessible part of A_sub, starting with the
initial state of A_sub, namely I. Then for each constructed state P and each letter
a, we construct δ(P, a) and the transition (P, a, δ(P, a)). And we stop when no new
state arises this way.

The second step consists in finding the co-accessible part of B, using the method
in Section 1.2.2.3.
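The two steps just described can be sketched as follows. This is one possible implementation under the representation used for illustration here (transitions as triples, subsets as frozensets). The function name determinize is hypothetical, and the co-accessibility step uses a simple fixed-point loop.

```python
def determinize(automaton, alphabet):
    """Accessible part of the subset automaton, then its co-accessible part."""
    Q, T, I, F = automaton
    start = frozenset(I)
    delta, todo, seen = {}, [start], {start}
    while todo:                                  # step 1: accessible subsets only
        P = todo.pop()
        for a in alphabet:
            R = frozenset(q for (p, b, q) in T if p in P and b == a)
            delta[(P, a)] = R
            if R not in seen:
                seen.add(R)
                todo.append(R)
    # Step 2: keep the subsets from which a final subset can be reached.
    useful = {P for P in seen if P & set(F)}
    changed = True
    while changed:
        changed = False
        for (P, a), R in delta.items():
            if R in useful and P not in useful:
                useful.add(P)
                changed = True
    delta = {(P, a): R for (P, a), R in delta.items() if P in useful and R in useful}
    # If L(A) is empty, `useful` is empty and `start` is not among the states.
    return (useful, delta, start, {P for P in useful if P & set(F)})

fig_1_3 = ({1, 2, 3},
           {(1, "a", 1), (1, "b", 1), (1, "a", 2), (2, "b", 3), (3, "a", 3), (3, "b", 3)},
           {1}, {3})
states, delta, start, finals = determinize(fig_1_3, "ab")
print(sorted(sorted(P) for P in states))   # [[1], [1, 2], [1, 2, 3], [1, 3]]
```

On the automaton of Example 1.4 only four of the eight subsets are constructed.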

Example 1.10. Let A = {a, b}, let n ≥ 2, and let L = A*aA^{n-2}. Then L is
accepted by a non-deterministic automaton A with n states. However, any
deterministic automaton accepting L must have at least 2^{n-1} states. To see this,
suppose that (Q, δ, i, F) is such a deterministic automaton. Let u, v be distinct
words of length n - 1. Then one of the words (let us say u) contains an a in a
position in which v contains the letter b. Thus u = u'ax, v = v'by, where |x| = |y|.
Let w be any word of length n - 2 - |x|. Then uw ∈ L, vw ∉ L. It follows that
δ(i, u) ≠ δ(i, v) and thus there are at least as many states as there are words of
length n - 1. This shows that the exponential blowup in the number of states in
the subset construction cannot in general be reduced.
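As a sanity check on small cases, the sketch below builds one n-state automaton for A*aA^{n-2} (an assumed but natural choice: a loop on state 1, then a chain of n - 1 further states) and counts the accessible subsets. The count only shows what the subset construction produces. The lower bound for every deterministic automaton is the argument given in the text.

```python
def nfa_for(n):
    """An n-state automaton for A* a A^(n-2), with A = {a, b} and n >= 2."""
    T = {(1, "a", 1), (1, "b", 1), (1, "a", 2)}
    T |= {(k, c, k + 1) for k in range(2, n) for c in "ab"}
    return (set(range(1, n + 1)), T, {1}, {n})

def count_accessible_subsets(automaton, alphabet="ab"):
    """Number of states in the accessible part of the subset automaton."""
    Q, T, I, F = automaton
    start = frozenset(I)
    seen, todo = {start}, [start]
    while todo:
        P = todo.pop()
        for a in alphabet:
            R = frozenset(q for (p, b, q) in T if p in P and b == a)
            if R not in seen:
                seen.add(R)
                todo.append(R)
    return len(seen)

for n in range(2, 9):
    assert count_accessible_subsets(nfa_for(n)) == 2 ** (n - 1)
```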

1.3. Logic: Büchi's sequential calculus

Let us start with an example. 

Example 1.11. Recall that ∧ is the logical conjunction, which reads "AND". And
∨ is the logical disjunction, which reads "OR". We will consider formulas such as

∃x ∃y (x < y) ∧ R_a x ∧ R_b y.

This formula has the following interpretation on a word u: there exist two natural
numbers x < y such that, in u, the letter in position x is an a and the letter in
position y is a b. Thus this formula specifies a language: the set of all words u in
which this formula holds, namely A*aA*bA*.

1.3.1. First-order formulas

Let us now formalize this point of view on languages.

1.3.1.1. Syntax

The formulas of Büchi's sequential calculus use the usual logical symbols (∧, ∨, ¬
for the negation), the equality symbol =, the constant symbol true, the quantifiers
∃ and ∀, variable symbols (x, y, z, ...) and parentheses. They also use specific,
non-logical symbols: binary relation symbols < and S, and unary relation symbols R_a
(one for each letter a ∈ A).

For convenience, we may assume that the variables are drawn from a fixed, 
countable, set of variables. 

The atomic formulas are the formulas of the form true, x = y, x < y, S(x, y),
and R_a x, where x and y are variables and a ∈ A.

The first-order formulas are defined as follows:

• Atomic formulas are first-order formulas,

• If φ and ψ are first-order formulas, then (¬φ), (φ ∧ ψ) and (φ ∨ ψ) are first-order
formulas,

• If φ is a first-order formula and if x is a variable, then (∃x φ) and (∀x φ) are
first-order formulas.

Remark 1.2. As is usual in logic, we will limit the use of parentheses in our
notation of formulas to what is necessary for their proper parsing, writing for
instance ∀x R_a x instead of (∀x (R_a x)).

Certain variables appear after a quantifier (existential or universal): occurrences 
of these variables within the scope of the quantifier are said to be bound. Other 
occurrences are said to be free. A precise, recursive, definition of the set FV(φ) of
the free variables of a formula φ is as follows:

• If φ is atomic, then FV(φ) is the set of all variables occurring in φ,

• FV(¬φ) = FV(φ),

• FV(φ ∧ ψ) = FV(φ ∨ ψ) = FV(φ) ∪ FV(ψ),

• FV(∃x φ) = FV(∀x φ) = FV(φ) \ {x}.

A formula without free variables is called a sentence. 
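The recursive definition of FV can be mirrored in a short Python function. The representation of formulas as nested tuples, such as ("exists", "x", body) or ("R", "a", "x") for R_a x, is an assumption made purely for this illustration.

```python
def free_vars(phi):
    """Set of free variables of a formula given as a nested tuple."""
    op = phi[0]
    if op == "true":
        return set()
    if op in ("=", "<", "S"):          # e.g. ("<", "x", "y")
        return {phi[1], phi[2]}
    if op == "R":                      # ("R", "a", "x") stands for R_a x
        return {phi[2]}
    if op == "not":
        return free_vars(phi[1])
    if op in ("and", "or"):
        return free_vars(phi[1]) | free_vars(phi[2])
    if op in ("exists", "forall"):     # ("exists", "x", body)
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown connective: {op!r}")

# ∃x ((x < y) ∧ R_a x) has the single free variable y.
formula = ("exists", "x", ("and", ("<", "x", "y"), ("R", "a", "x")))
print(free_vars(formula))                                # {'y'}
print(free_vars(("forall", "y", formula)) == set())      # True: a sentence
```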



1.3.1.2. Interpretation of formulas 

In Büchi's sequential calculus, formulas are interpreted in words: each word u of
length n ≥ 0 determines a structure (which we abusively denote by u) with domain
Dom(u) = {0, ..., n - 1} (Dom(u) = ∅ if u = ε). Dom(u) is viewed as the set of
positions in the word u (numbered from 0).

The symbol < is interpreted in Dom(u) as the usual order (as in (2 < 4) and
¬(3 < 2)). The symbol S is interpreted as the successor relation: if x, y ∈ Dom(u),
then S(x, y) if and only if y = x + 1. Finally, for each letter a ∈ A, the unary
relation symbol R_a is interpreted as the set of positions in u that carry an a (a
subset of Dom(u)).



Example 1.12. If u = abbaab, then Dom(u) = {0, 1, ..., 5}, R_a = {0, 3, 4} and
R_b = {1, 2, 5}.



A valuation on u is a mapping ν from a set of variables into the domain Dom(u).
It will be useful to have a notation for small modifications of a valuation: if ν is
a valuation and d is an element of Dom(u), we let ν[x ↦ d] be the valuation ν'
defined by extending the domain of ν to include the variable x and setting

ν'(x) = d, and ν'(y) = ν(y) for every variable y ≠ x in the domain of ν.

If φ is a formula, u ∈ A* and ν is a valuation on u whose domain includes the free
variables of φ, then we define u, ν ⊨ φ (and say that the valuation ν satisfies φ in
u, or equivalently u, ν satisfies φ) as follows:

• u, ν ⊨ (x = y) (resp. (x < y), S(x, y), R_a x) if and only if ν(x) = ν(y) (resp.
ν(x) < ν(y), S(ν(x), ν(y)), R_a ν(x)) in Dom(u);

• u, ν ⊨ ¬φ if and only if it is not true that u, ν ⊨ φ;

• u, ν ⊨ (φ ∨ ψ) (resp. (φ ∧ ψ)) if and only if at least one (resp. both) of u, ν ⊨ φ
and u, ν ⊨ ψ holds (resp. hold);

• u, ν ⊨ (∃x φ) if and only if there exists d ∈ Dom(u) such that u, ν[x ↦ d] ⊨ φ;

• u, ν ⊨ (∀x φ) if and only if, for each d ∈ Dom(u), u, ν[x ↦ d] ⊨ φ.

Note that the truth value of u, ν ⊨ φ depends only on the values assigned by ν
to the free variables of φ. In particular, if φ is a sentence, we may use the valuation
μ with an empty domain. We say that φ is satisfied by u (or u satisfies φ), and we
write u ⊨ φ for u, μ ⊨ φ. Thus each sentence φ defines a language: the set L(φ) of
all words u such that u ⊨ φ. Note that this interpretation makes sense even if u is the
empty word, for then the valuation μ is still defined: Every sentence beginning with
a universal quantifier is satisfied by ε, and no sentence beginning with an existential
quantifier is satisfied by ε. An early example was given in Example 1.11.

Remark 1.3. Two sentences $\varphi$ and $\psi$ are said to be logically equivalent if they are
satisfied by the same structures. We will use freely the classical logical equivalence
results, such as the logical equivalence of $\varphi \wedge \psi$ and $\neg(\neg\varphi \vee \neg\psi)$, or the logical
equivalence of $\forall x\, \varphi$ and $\neg(\exists x\, \neg\varphi)$. We will also use the implication and bi-implication
notation: $\varphi \to \psi$ stands for $\neg\varphi \vee \psi$ and $\varphi \leftrightarrow \psi$ stands for $(\varphi \to \psi) \wedge (\psi \to \varphi)$.

Example 1.13. Let $\varphi$ and $\psi$ be the following formulas.

$$\varphi = \exists x\, \big((\forall y\ \neg(y < x)) \wedge R_a x\big)$$
$$\psi = \forall x\, \big((\forall y\ \neg(y < x)) \to R_a x\big)$$

The sentence $\varphi$ states that there exists a position with no strict predecessor,
containing an $a$, while $\psi$ states that every such position contains an $a$. The latter
sentence, like all universally quantified first-order sentences, is vacuously satisfied
by the empty string. Thus $L(\varphi) = aA^*$ and $L(\psi) = aA^* \cup \{\varepsilon\}$.
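
The satisfaction relation defined above can be turned directly into a small recursive evaluator. The following is only a sketch under our own conventions, which are not taken from the text: formulas are encoded as nested tuples such as `("lt", "x", "y")` or `("R", "a", "x")`, and the helper names `holds`, `implies` and `words` are ours. It checks the two sentences of Example 1.13 on all short words.

```python
from itertools import product

def holds(phi, u, nu=None):
    """Decide u, nu |= phi for a first-order formula encoded as nested tuples."""
    nu = nu or {}
    op = phi[0]
    if op == "eq":
        return nu[phi[1]] == nu[phi[2]]
    if op == "lt":
        return nu[phi[1]] < nu[phi[2]]
    if op == "succ":
        return nu[phi[2]] == nu[phi[1]] + 1
    if op == "R":                       # ("R", letter, variable)
        return u[nu[phi[2]]] == phi[1]
    if op == "not":
        return not holds(phi[1], u, nu)
    if op == "or":
        return holds(phi[1], u, nu) or holds(phi[2], u, nu)
    if op == "and":
        return holds(phi[1], u, nu) and holds(phi[2], u, nu)
    if op == "exists":                  # ("exists", variable, body)
        return any(holds(phi[2], u, {**nu, phi[1]: d}) for d in range(len(u)))
    if op == "forall":
        return all(holds(phi[2], u, {**nu, phi[1]: d}) for d in range(len(u)))
    raise ValueError(op)

def implies(a, b):
    """a -> b is shorthand for (not a) or b."""
    return ("or", ("not", a), b)

# "x has no strict predecessor"
no_pred = ("forall", "y", ("not", ("lt", "y", "x")))
phi = ("exists", "x", ("and", no_pred, ("R", "a", "x")))
psi = ("forall", "x", implies(no_pred, ("R", "a", "x")))

def words(alphabet, max_len):
    """All words over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)
```

On every word over {a, b} of length at most 5, `holds(phi, w)` agrees with membership in aA*, and `holds(psi, w)` additionally accepts the empty word, because quantifiers range over the empty domain there.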

The first-order logic of the linear order (resp. of the successor), written FO($<$)
(resp. FO($S$)), is the fragment of the first-order logic described so far in which formulas
do not use the symbol $S$ (resp. $<$).

1.3.2. Monadic second-order formulas 

In monadic second-order logic, we add a new type of variable to first-order logic,
called set variables and usually denoted by upper case letters, e.g. $X, Y, \ldots$ The
atomic formulas of monadic second-order logic are the atomic formulas of first-order logic,
and the formulas of the form $(Xy)$, where $X$ is a set variable and $y$ is an ordinary
variable.

The recursive definition of monadic second-order formulas, starting from the
atomic formulas, closely resembles that of first-order formulas: it uses the same
rules given in Section 1.3.1 and the additional rule:

• If $\varphi$ is a monadic second-order formula and $X$ is a set variable, then $(\exists X\, \varphi)$ and
$(\forall X\, \varphi)$ are monadic second-order formulas.

The notion of free variables is extended in the same fashion. 

The interpretation of monadic second-order formulas also requires an extension
of the definition of a valuation on a word $u$: a monadic second-order valuation is a
mapping $\nu$ which associates with each first-order variable an element of the domain
$\mathrm{Dom}(u)$, and with each set variable, a subset of $\mathrm{Dom}(u)$.

If $\nu$ is a valuation, $X$ is a set variable, and $R$ is a subset of $\mathrm{Dom}(u)$, we denote by
$\nu[X \mapsto R]$ the valuation obtained from $\nu$ by mapping $X$ to $R$ (see Section 1.3.1.2).

With these definitions, we can recursively give a meaning to the notion that a
valuation $\nu$ satisfies a formula $\varphi$ in a word $u$ ($u, \nu \models \varphi$): we use again the rules
given in Section 1.3.1.2, to which we add the following:

• $u, \nu \models (Xy)$ if and only if $\nu(y) \in \nu(X)$;

• $u, \nu \models (\exists X\, \varphi)$ (resp. $(\forall X\, \varphi)$) if and only if there exists $R \subseteq \mathrm{Dom}(u)$ such that
(resp. for each $R \subseteq \mathrm{Dom}(u)$) $u, \nu[X \mapsto R] \models \varphi$.



Note that the empty set is a valid assignment for a set variable: the empty word
may satisfy monadic second-order sentences even if they start with an existential
set quantifier.

Büchi's sequential calculus (see Section 1.3.1.2) is thus extended to include
monadic second-order formulas. We denote by MSO($<$) (resp. MSO($S$)) the fragment
of monadic second-order logic in which formulas do not use the symbol $S$ (resp.
$<$). Of course, FO($<$) and FO($S$) are subsets of MSO($<$) and MSO($S$), respectively.

Example 1.14. Inspecting the following MSO($<$) sentence,

$$\varphi = \exists X\, \Big[ \forall x\, \big(Xx \leftrightarrow ((\forall y\ \neg(x < y)) \vee (\forall y\ \neg(y < x)))\big) \;\wedge\; \forall x\, (Xx \to R_a x) \;\wedge\; \exists x\, Xx \Big],$$

one can see that the elements of $X$ must be the first and last positions of the word
in which we interpret $\varphi$, so $L(\varphi) = aA^* \cap A^*a$. This language can also be described
by a first-order sentence (see Example 1.13); that is, this formula is equivalent to a
first-order formula.

Example 1.15. We now consider the more complex formula

$$\varphi = \exists X\, \Big( \big(\forall x\, \forall y\ ((x < y) \wedge (\forall z\ \neg((x < z) \wedge (z < y)))) \to (Xx \leftrightarrow \neg Xy)\big) \;\wedge\; \big(\forall x\ (\forall y\ \neg(y < x)) \to Xx\big) \;\wedge\; \big(\forall x\ (\forall y\ \neg(x < y)) \to \neg Xx\big) \Big).$$

The formula $\varphi$ states that there exists a set $X$ of positions in the word, such that a
position is in $X$ if and only if the next position is not in $X$ (so $X$ has every other
position), and the first position is in $X$, and the last position is not in $X$. Thus
$L(\varphi)$ is the set of words of even length. It is an easy consequence of the results of
Section 1.7 that this language cannot be described by a first-order formula.
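
The claim of Example 1.15 can be checked by brute force on short words. The sketch below is our own illustration (the function names are hypothetical): it enumerates every candidate set X of positions and tests the three conjuncts directly. Since the sentence mentions no letter predicate, only the length n of the word matters.

```python
from itertools import combinations

def subsets(n):
    """All subsets of Dom(u) = {0, ..., n-1}."""
    for k in range(n + 1):
        for c in combinations(range(n), k):
            yield set(c)

def satisfies_example(n):
    """Brute-force check of the sentence of Example 1.15 on a word of length n."""
    for X in subsets(n):
        # consecutive positions x, x+1: exactly one of them belongs to X
        alternates = all((x in X) != (x + 1 in X) for x in range(n - 1))
        # the first position (if any) is in X
        first_in = all(x in X for x in range(n) if x == 0)
        # the last position (if any) is not in X
        last_out = all(x not in X for x in range(n) if x == n - 1)
        if alternates and first_in and last_out:
            return True
    return False
```

For the empty word all three conditions are vacuous and X can be the empty set, which matches the remark that an existential set quantifier can be satisfied on the empty word.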

The successor relation can be expressed in FO($<$): $S(x, y)$ is logically equivalent
to the following formula:

$$(x < y) \wedge \forall z\, \big((x < z) \to ((y = z) \vee (y < z))\big).$$

In a weak converse, the order relation $<$ can be expressed in MSO($S$): the formula
$x < y$ is equivalent to:

$$\exists X\, \big(Xy \wedge \neg Xx \wedge [\forall z\, \forall t\ ((Xz \wedge S(z, t)) \to Xt)]\big).$$

It follows that MSO($<$) and MSO($S$) have the same expressive power.
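
Both translations can be sanity-checked on small domains. The following sketch, with function names of our own choosing, evaluates the FO($<$) definition of the successor and the MSO($S$) definition of the order by exhaustive search over a domain {0, ..., n-1}.

```python
from itertools import combinations

def succ_via_order(x, y, n):
    """S(x, y) expressed with <: x < y, and every z > x satisfies y = z or y < z."""
    return x < y and all((not x < z) or y == z or y < z for z in range(n))

def order_via_succ(x, y, n):
    """x < y expressed in MSO(S): some successor-closed set contains y but not x."""
    for k in range(n + 1):
        for X in map(set, combinations(range(n), k)):
            # closure under successor, restricted to positions inside the domain
            closed = all(z + 1 in X for z in X if z + 1 < n)
            if y in X and x not in X and closed:
                return True
    return False
```

For x < y, the witness set is {y, y+1, ..., n-1}; for x ≥ y, any successor-closed set containing y must also contain x, so no witness exists.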

Proposition 1.5. A language can be defined by a sentence in MSO(S), if and only 
if it can be defined by a sentence in MSO(<). 

However, the order relation $<$ cannot be expressed in FO($S$). This is a non-trivial
result; for a proof, see [21].

Proposition 1.6. If a language can be defined by a sentence in FO($S$), then it can
be defined by a sentence in FO($<$). The converse does not hold.

1.4. The Kleene-Büchi theorem

In this section, we prove the following theorem, a combination of the classical Kleene
and Büchi theorems.

Theorem 1.1. Let L be a language in A* . The following conditions are equivalent: 

(1) L is defined by a sentence in MSO(<); 

(2) L is accepted by an automaton; 

(3) L is extended rational; 

(4) L is rational. 

1.4.1. From automata to monadic second-order formulas 

Let $\mathcal{A} = (Q, i, \delta, F)$ be a deterministic automaton. The idea is to associate with
each state $q \in Q$ a second-order variable $X_q$, to encode the set of positions in which
a given path visits state $q$. What we need to express about the sets $X_q$ is the
following:

• the sets $X_q$ form a partition of the set of all positions (at each point in time,
the automaton must be in one and exactly one state);

• if a path visits state $q$ at time $x$, state $q'$ at time $x + 1$ and if the letter in
position $x + 1$ is an $a$, then $\delta(q, a) = q'$.

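This analysis leads to the following formula. For convenience, let $Q$ be the set
$\{q_0, q_1, \ldots, q_n\}$, with initial state $i = q_0$. We also use the shorthand $\min$ and $\max$
to designate the first and last positions: this is acceptable as these positions can be
expressed by FO($S$)-formulas. For instance, $R_a \min$ stands for $\forall x\, ((\forall y\ \neg S(y, x)) \to R_a x)$;
and $X \max$ stands for $\forall x\, ((\forall y\ \neg S(x, y)) \to Xx)$.

$$\exists X_{q_0} \exists X_{q_1} \cdots \exists X_{q_n} \Big[ \bigwedge_{q \neq q'} \neg\exists x\, (X_q x \wedge X_{q'} x) \;\wedge\; \forall x\, \forall y\, \Big( S(x, y) \to \bigvee_{q \in Q,\ a \in A} (X_q x \wedge R_a y \wedge X_{\delta(q,a)} y) \Big) \;\wedge\; \bigvee_{a \in A} (R_a \min \wedge X_{\delta(q_0, a)} \min) \;\wedge\; \bigvee_{q \in F} X_q \max \Big]$$

This sentence is actually satisfied by the empty word, so the language it defines
coincides with $L(\mathcal{A})$ on $A^+$. If $q_0 \in F$, it accurately defines $L(\mathcal{A})$. But if $q_0 \notin F$,
we must consider the conjunction of this sentence with $\exists x\ \mathrm{true}$.

This is a sentence in MSO($S$, $<$) but as we know, it is logically equivalent to one
in MSO($<$). Note that it is in fact an existential monadic second-order sentence,
that is, the second-order quantifications are all existential.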
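
To make the role of the sets $X_q$ concrete, here is a sketch (our own encoding, not part of the text) for a complete deterministic automaton given by a transition dictionary `delta[(state, letter)]`. The function `run_sets` computes the sets read off from the unique run on a word, and `witnesses_formula` checks the conditions described above for a candidate family of sets on a non-empty word: partition, first position, consecutive positions, last position.

```python
def run_sets(delta, q0, word):
    """X[q] = positions x such that the automaton is in state q after reading word[x]."""
    X, q = {}, q0
    for x, a in enumerate(word):
        q = delta[q, a]
        X.setdefault(q, set()).add(x)
    return X

def witnesses_formula(delta, q0, final, word, X):
    """Check the conditions on the sets X_q for a non-empty word."""
    n = len(word)
    state_at = [[q for q, s in X.items() if x in s] for x in range(n)]
    if any(len(qs) != 1 for qs in state_at):       # the X_q partition the positions
        return False
    st = [qs[0] for qs in state_at]
    first_ok = st[0] == delta[q0, word[0]]         # condition on min
    steps_ok = all(st[y] == delta[st[y - 1], word[y]] for y in range(1, n))
    return first_ok and steps_ok and st[-1] in final   # condition on max

# Hypothetical example: words over {a, b} ending with an a.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
```

The sets computed from the run are a witness for the existential set quantifiers exactly when the word is accepted.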

1.4.2. From formulas to extended rational expressions

The proof that an MSO($<$)-definable language can be described by an extended
rational expression is more complex. The reasoning is by induction on the recursive
definition of formulas. Instead of associating a language only with sentences
(formulas without free variables), we will associate languages with all formulas, but
these languages will be over larger alphabets, which allow us to encode valuations.

1.4.2.1. The auxiliary alphabets $B_{p,q}$

Let $p, q \ge 0$ and let $B_{p,q} = A \times \{0, 1\}^p \times \{0, 1\}^q$. A word over the alphabet $B_{p,q}$
can be identified with a sequence $(u_0, u_1, \ldots, u_p, u_{p+1}, \ldots, u_{p+q})$ where $u_0 \in A^*$,
$u_1, \ldots, u_p, u_{p+1}, \ldots, u_{p+q} \in \{0, 1\}^*$ and all the $u_i$ have the same length.

Let $K_{p,q}$ consist of the empty word and the words in $B_{p,q}^+$ such that each of
the components $u_1, \ldots, u_p$ contains exactly one occurrence of 1. Thus each of
these components really designates one position in the word $u_0$, and each of the
components $u_{p+1}, \ldots, u_{p+q}$ designates a set of positions in $u_0$.

Example 1.16. If $A = \{a, b\}$, the following is a word in $K_{2,1}$.

    u_0   a  b  a  a  b  a  b
    u_1   0  0  0  0  1  0  0
    u_2   0  0  1  0  0  0  0
    u_3   0  1  1  0  0  1  1

Its components $u_1$ and $u_2$ designate positions 4 and 2, respectively, and its
component $u_3$ designates the set $\{1, 2, 5, 6\}$.
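
The encoding of a word together with a valuation can be written out as follows. This is a sketch with helper names of our own (`encode`, `in_K`): a word over $B_{p,q}$ is represented as a list of tuples, one per position, and membership in $K_{p,q}$ is tested by counting the 1's in the first $p$ binary components.

```python
def encode(u, positions, sets):
    """Word over B_{p,q}: one letter (u[i], bits for positions, bits for sets) per position i."""
    return [
        (a, *(int(i == n) for n in positions), *(int(i in N) for N in sets))
        for i, a in enumerate(u)
    ]

def in_K(word, p):
    """Membership in K_{p,q}: empty, or exactly one 1 in each of the components 1..p."""
    return not word or all(sum(b[i] for b in word) == 1 for i in range(1, p + 1))

# The word of Example 1.16: u_1 and u_2 designate positions 4 and 2, u_3 the set {1, 2, 5, 6}.
w = encode("abaabab", positions=[4, 2], sets=[{1, 2, 5, 6}])
```

Reading the columns of the table in Example 1.16 gives exactly the letters of `w`; for instance the letter at position 2 is `("a", 0, 1, 1)`.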

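The languages $K_{p,q}$ are extended rational. Indeed, for $1 \le i \le p$, let $C_i$ be the
set of elements $(b_0, b_1, \ldots, b_{p+q}) \in B_{p,q}$ such that $b_i = 1$. Then $K_{p,q}$ consists of the empty
word and of the words in $B_{p,q}^*$ which contain exactly one letter in each $C_i$:

$$K_{p,q} = \{\varepsilon\} \cup \bigcap_{1 \le i \le p} (B_{p,q} \setminus C_i)^* C_i (B_{p,q} \setminus C_i)^* = \{\varepsilon\} \cup \bigcap_{1 \le i \le p} \big( B_{p,q}^* C_i B_{p,q}^* \setminus B_{p,q}^* C_i B_{p,q}^* C_i B_{p,q}^* \big).$$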

1.4.2.2. The language associated with a formula

Let now $\varphi(x_1, \ldots, x_r, X_1, \ldots, X_s)$ be a formula in which the free first-order (resp.
set) variables are $x_1, \ldots, x_r$ (resp. $X_1, \ldots, X_s$), with $r \le p$ and $s \le q$.
We interpret

• $R_a$ as $R_a = \{i \in \mathrm{Dom}(u_0) \mid u_0(i) = a\}$;

• $x_i$ as the unique position of 1 in $u_i$ (if $u_i \neq \varepsilon$);

• $X_j$ as the set of positions of 1 in $u_{p+j}$.

Note that if $p = q = 0$, then $\varphi$ is a sentence and this is the usual notion of
interpretation.

More formally, let $(u_0, u_1, \ldots, u_{p+q})$ be a non-empty word in $K_{p,q}$. Let $n_i$ be
the position of the unique 1 in the word $u_i$, and let $N_j$ be the set of the positions
of the 1's in the word $u_{p+j}$. We say that $u = (u_0, u_1, \ldots, u_{p+q}) \in K_{p,q}$ satisfies $\varphi$ if
$u_0, \nu$ satisfy $\varphi$, where $\nu$ is the valuation defined by

$$\nu(x_i) = n_i \text{ for } 1 \le i \le r \quad\text{and}\quad \nu(X_j) = N_j \text{ for } 1 \le j \le s.$$

We also say that the empty word (in $K_{p,q}$) satisfies $\varphi$ if $\varepsilon \models \varphi$. We let $L_{p,q}(\varphi) =
\{u \in K_{p,q} \mid u \text{ satisfies } \varphi\}$. Thus each formula $\varphi$ defines a subset of $K_{p,q}$, and hence
a language in $B_{p,q}^*$.

Example 1.17. Let $\varphi = \exists x\, (x < y \wedge R_a y)$. Then $FV(\varphi) = \{y\}$, and $L_{1,0}(\varphi)$ is
the set of pairs of words $(u_0, u_1)$ such that $u_0 \in A^*$, $u_1 \in \{0, 1\}^*$, $u_0$ and $u_1$ have
the same length, $u_1$ has a single 1, which is not in the first position, and $u_0$ has an $a$
in that position.

Let $\varphi' = \forall x\, ((Xx \wedge x < y \wedge R_b y) \to R_a x)$. Then $L_{1,1}(\varphi')$ is the set of triples
of words $(u_0, u_1, u_2)$ with $u_0 \in A^*$, $u_1, u_2 \in \{0, 1\}^*$, such that all three words have the same
length, and either this length is zero, or $u_1$ has a single 1 such that:

    Let $n$ be the position in $u_1$ which has a 1. If $u_0$ has a $b$ in position $n$,
    then $u_0$ has an $a$ in each position before $n$ in which $u_2$ has a 1. If $u_0$ does
    not have a $b$ in position $n$, then there is no constraint.

1.4.2.3. The MSO($<$)-definable languages are extended rational

We first consider the languages associated with an atomic formula. Let $1 \le i, j \le
p + q$ and let $a \in A$. Let

$$C_{j,a} = \{b \in B_{p,q} \mid b_j = 1 \text{ and } b_0 = a\},$$
$$C_{i,j} = \{b \in B_{p,q} \mid b_i = b_j = 1\},$$
$$\text{and}\quad C_i = \{b \in B_{p,q} \mid b_i = 1\}.$$

Then we have

$$L_{p,q}(R_a x_i) = K_{p,q} \cap B_{p,q}^* C_{i,a} B_{p,q}^*$$
$$L_{p,q}(x_i = x_j) = K_{p,q} \cap B_{p,q}^* C_{i,j} B_{p,q}^*$$
$$L_{p,q}(x_i < x_j) = K_{p,q} \cap B_{p,q}^* C_i B_{p,q}^* C_j B_{p,q}^*$$
$$L_{p,q}(X_i x_j) = K_{p,q} \cap B_{p,q}^* C_{i+p,j} B_{p,q}^*.$$

Thus, the languages defined by the atomic formulas, namely $L_{p,q}(R_a x)$, $L_{p,q}(x = y)$,
$L_{p,q}(x < y)$ and $L_{p,q}(Xy)$, are extended rational.

Now let $\varphi$ and $\psi$ be formulas and let us assume that $L_{p,q}(\varphi)$ and $L_{p,q}(\psi)$ are
extended rational. Then we have

$$L_{p,q}(\varphi \vee \psi) = L_{p,q}(\varphi) \cup L_{p,q}(\psi)$$
$$L_{p,q}(\varphi \wedge \psi) = L_{p,q}(\varphi) \cap L_{p,q}(\psi)$$
$$L_{p,q}(\neg\varphi) = K_{p,q} \setminus L_{p,q}(\varphi),$$

and hence these three languages are extended rational as well. We still need to
handle existential quantification.

Let $\pi_i$ be the morphism which deletes the $i$-th component in a word of $B_{p,q}^*$; that
is: if $1 \le i \le p$, then $\pi_i\colon B_{p,q}^* \to B_{p-1,q}^*$, and if $p < i \le p+q$, then $\pi_i\colon B_{p,q}^* \to B_{p,q-1}^*$.
In either case, we have $\pi_i(b_0, b_1, \ldots, b_{p+q}) = (b_0, b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_{p+q})$.

Now, observe that, for any formula $\varphi(x_1, \ldots, x_r, X_1, \ldots, X_s)$, and for $p \ge r$,
$q \ge s$, $1 \le i \le p$ and $1 \le j \le q$, we have

$$L_{p-1,q}(\exists x_i\, \varphi) = \pi_i(L_{p,q}(\varphi)) \quad\text{and}\quad L_{p,q-1}(\exists X_j\, \varphi) = \pi_{p+j}(L_{p,q}(\varphi)).$$

This concludes the proof that $L_{p,q}(\varphi)$ is extended rational for any $p \ge r$, $q \ge s$.

In particular, if $\varphi$ is a sentence in MSO($<$) (that is, $\varphi$ has no free variables), we
may take $p = q = 0$. Then $L_{0,0}(\varphi)$ is extended rational, and we already noted that
$L(\varphi) = L_{0,0}(\varphi)$.

1.4.3. From extended rational expressions to automata 

It is immediately verified that the languages $\emptyset$, $\{\varepsilon\}$, $\{a\}$ ($a \in A$) are accepted by
finite automata. We now need to show that if $K, L \subseteq A^*$ are recognizable and
if $\pi\colon A^* \to B^*$ is a morphism, then $\overline{L}$, $K \cup L$, $K \cap L$, $KL$, $K^*$ and $\pi(L)$ are
recognizable.

Proposition 1.7. If $L \subseteq A^*$ is recognizable, then the complement $\overline{L}$ of $L$ is recognizable
as well.

Proof. Let $\mathcal{A} = (Q, \delta, i, F)$ be a deterministic complete automaton recognizing
$L$. Then $\overline{\mathcal{A}} = (Q, \delta, i, Q \setminus F)$ recognizes $\overline{L}$ by Proposition 1.3. □

Example 1.18. The deterministic automata in Examples 1.5 and 1.6 confirm that,
if $A = \{a, b\}$, then $b^*a^*$ is the complement of $A^*abA^*$.

Note that the resulting procedure yields a deterministic automaton for $\overline{L}$. It
is very efficient if $L$ is given by a deterministic automaton, but may lead to an
exponential growth in the number of states if $L$ is given by a non-deterministic
automaton.

Proposition 1.8. If $L, L' \subseteq A^*$ are recognizable, then $L \cup L'$ and $L \cap L'$ are recognizable
as well.

Proof. Let $\mathcal{A} = (Q, T, I, F)$ and $\mathcal{A}' = (Q', T', I', F')$ be automata recognizing $L$
and $L'$, respectively. We assume that the state sets $Q$ and $Q'$ are disjoint. Then it
is readily verified that the automaton

$$\mathcal{A} \cup \mathcal{A}' = (Q \cup Q', T \cup T', I \cup I', F \cup F')$$

accepts $L \cup L'$. Thus $L \cup L'$ is recognizable, and hence so is $L \cap L' = \overline{\overline{L} \cup \overline{L'}}$, by
Proposition 1.7. □

The construction in the above proof always yields a non-deterministic automaton
for $L \cup L'$, even if we start from deterministic automata for $L$ and $L'$. The product of
automata provides an alternative construction which preserves determinism, avoids
any exponentiation of the number of states, and works for both the union and the
intersection.

Let $\mathcal{A} = (Q, T, I, F)$ and $\mathcal{A}' = (Q', T', I', F')$ be automata recognizing the
languages $L$ and $L'$. Their cartesian product is the automaton $\mathcal{A}'' = (Q \times Q', T'', I \times
I', F \times F')$ where

$$T'' = \{((p, p'), a, (q, q')) \mid (p, a, q) \in T \text{ and } (p', a, q') \in T'\}.$$

Note that if $\mathcal{A}$ and $\mathcal{A}'$ are deterministic, then $\mathcal{A}''$ is deterministic as well. The main
property of $\mathcal{A}''$ is the following: there exists a path $(p, p') \xrightarrow{u} (q, q')$ in $\mathcal{A}''$ if and
only if there exist paths $p \xrightarrow{u} q$ and $p' \xrightarrow{u} q'$, in $\mathcal{A}$ and $\mathcal{A}'$ respectively. Therefore
$\mathcal{A}''$ recognizes $L \cap L'$.

If we take $(F \times Q') \cup (Q \times F')$ as the set of final states, instead of $F \times F'$, and
if the automata $\mathcal{A}$ and $\mathcal{A}'$ are complete, then the product automaton recognizes
$L \cup L'$.

In practice, the cartesian product of $\mathcal{A}$ and $\mathcal{A}'$ may not be trim, and one may
want to use the procedure in Section 1.2.2.3 to produce more concise automata for
$L \cap L'$ and $L \cup L'$.
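
The product construction is short to write down for complete deterministic automata. The sketch below uses a representation of our own, a tuple `(states, alphabet, delta, init, finals)` with `delta[(state, letter)]`; the two example automata are hypothetical and serve only as a test case.

```python
def product_automaton(A, B, mode="and"):
    """Cartesian product of two complete DFAs over the same alphabet."""
    QA, sigma, dA, iA, FA = A
    QB, _, dB, iB, FB = B
    states = {(p, q) for p in QA for q in QB}
    delta = {((p, q), a): (dA[p, a], dB[q, a]) for (p, q) in states for a in sigma}
    if mode == "and":
        finals = {(p, q) for (p, q) in states if p in FA and q in FB}   # F x F'
    else:
        finals = {(p, q) for (p, q) in states if p in FA or q in FB}    # (F x Q') U (Q x F')
    return states, sigma, delta, (iA, iB), finals

def accepts(A, word):
    """Run the DFA on the word and test whether the last state is final."""
    _, _, delta, q, finals = A
    for a in word:
        q = delta[q, a]
    return q in finals

# Hypothetical examples: an even number of a's, and words ending with b.
EVEN_A = ({0, 1}, "ab", {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, "ab", {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})
```

Only the set of final states differs between the intersection and the union; the union variant relies on both automata being complete, as noted above.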

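Remark 1.4. Let us record here an algorithmic consequence of Propositions 1.7
and 1.8: given two automata $\mathcal{A}$ and $\mathcal{B}$, it is decidable whether $L(\mathcal{A}) \subseteq L(\mathcal{B})$ and
whether $L(\mathcal{A}) = L(\mathcal{B})$. Indeed, we can compute automata accepting $L(\mathcal{A}) \setminus L(\mathcal{B}) =
L(\mathcal{A}) \cap \overline{L(\mathcal{B})}$ and $L(\mathcal{B}) \setminus L(\mathcal{A})$, and decide whether these languages are empty (see
Remark 1.1).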
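
For complete deterministic automata, the inclusion test of Remark 1.4 amounts to a reachability search in the product. The following sketch assumes our own representation `(alphabet, delta, init, finals)`; it explores the reachable pairs of states and looks for a pair whose first component is final and whose second is not, that is, for a witness that $L(\mathcal{A}) \setminus L(\mathcal{B})$ is non-empty.

```python
from collections import deque

def included(A, B):
    """Decide whether L(A) is included in L(B), for complete DFAs over one alphabet."""
    sigma, dA, iA, FA = A
    _, dB, iB, FB = B
    seen, queue = {(iA, iB)}, deque([(iA, iB)])
    while queue:
        p, q = queue.popleft()
        if p in FA and q not in FB:      # some word is accepted by A and rejected by B
            return False
        for a in sigma:
            nxt = (dA[p, a], dB[q, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

def equivalent(A, B):
    """L(A) = L(B) iff both inclusions hold."""
    return included(A, B) and included(B, A)
```

The search visits at most $|Q| \cdot |Q'|$ pairs, so no complement automaton needs to be built explicitly in the deterministic case.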

Proposition 1.9. If $L, L' \subseteq A^*$ are recognizable, then $LL'$ and $L^*$ are recognizable
as well.

Sketch of proof. Let $\mathcal{A} = (Q, T, I, F)$ and let $\mathcal{A}' = (Q', T', I', F')$ be automata
accepting $L$ and $L'$, respectively, and let us assume that their state sets are disjoint.
It is easily verified that the $\varepsilon$-automaton

$$(Q \cup Q',\ T \cup T' \cup (F \times \{\varepsilon\} \times I'),\ I,\ F')$$

accepts $LL'$ (see Section 1.2.2.4). Similarly, if $j$ is a state not in $Q$, the $\varepsilon$-automaton

$$(Q \cup \{j\},\ T \cup (F \times \{\varepsilon\} \times I),\ I \cup \{j\},\ F \cup \{j\})$$

accepts $L^*$. □

Proposition 1.10. If $L \subseteq A^*$ is recognizable and $\varphi\colon A^* \to B^*$ is a morphism, then
$\varphi(L)$ is recognizable as well.

Sketch of proof. Let $\mathcal{A} = (Q, T, I, F)$ be an automaton recognizing $L$. We let $\mathcal{A}'$ be
the $\varepsilon$-automaton $\mathcal{A}' = (Q \cup Q', T', I, F)$, where the set $T'$ consists of

- the transitions of the form $(p, \varepsilon, q)$ such that $(p, a, q) \in T$ for some letter $a$ with
$\varphi(a) = \varepsilon$,

- the transitions occurring in the paths of the form

$$p \xrightarrow{b_1} q'_1 \xrightarrow{b_2} \cdots\ q'_{k-1} \xrightarrow{b_k} q$$

such that $(p, a, q) \in T$, $\varphi(a) = b_1 \cdots b_k \neq \varepsilon$ and $q'_1, \ldots, q'_{k-1}$ are new states that
we adjoin for each such triple $(p, a, q)$.

The set $Q'$ contains all the new states that occur in the latter paths. It is elementary
to verify that $\mathcal{A}'$ recognizes $\varphi(L)$. □



So far, we have shown that a language is recognizable, if and only if it is defined
by a sentence in MSO($<$), if and only if it is extended rational.

Remark 1.5. Note that the proofs of this logical equivalence are constructive, in
the sense that given a sentence $\varphi$ in MSO($<$), we can construct an automaton $\mathcal{A}$ such
that $L(\varphi) = L(\mathcal{A})$. It follows that MSO($<$) is decidable: given an MSO sentence
$\varphi$, we can decide whether $\varphi$ always holds. Indeed, this is the case if and only if
$L(\neg\varphi) = \emptyset$, which can be tested as discussed in Remark 1.1.



1.4.4. From automata to rational expressions 

To complete the proof of the Kleene-Büchi theorem, it suffices to prove that every
recognizable language is rational. For this, we use the McNaughton-Yamada
construction.

Let $\mathcal{A} = (Q, T, I, F)$ be an automaton. For each pair of states $p, q \in Q$ and for
each subset $P \subseteq Q$, let $L_{p,q}(P)$ be the set of all words $u \in A^*$ which label a path
from state $p$ to state $q$, such that the states visited internally by that path are all
in $P$:

$$L_{p,q}(P) = \{a_1 a_2 \cdots a_n \in A^* \mid \text{there exists a path in } \mathcal{A},\ p \xrightarrow{a_1} q_1 \cdots q_{n-1} \xrightarrow{a_n} q, \text{ with } q_1, \ldots, q_{n-1} \in P\}.$$

Recall that, by convention, there always exists an empty path, labeled by the
empty word, from any state $q$ to itself. So $\varepsilon \in L_{p,q}(P)$ if and only if $p = q$.

We show by induction on the cardinality of $P$ that each language $L_{p,q}(P)$ is
rational. This will prove that $L(\mathcal{A})$ is rational, since $L(\mathcal{A}) = \bigcup_{i \in I,\ f \in F} L_{i,f}(Q)$.

If $P = \emptyset$, then $L_{p,q}(\emptyset) = \{a \in A \mid (p, a, q) \in T\}$ if $p \neq q$, and $L_{q,q}(\emptyset) = \{a \in A \mid
(q, a, q) \in T\} \cup \{\varepsilon\}$. Thus $L_{p,q}(\emptyset)$ is always finite, and hence rational.

Now let $n > 0$ and let us assume that, for any $p, q \in Q$ and $P \subseteq Q$ containing
at most $n - 1$ states, the language $L_{p,q}(P)$ is rational. Let now $P \subseteq Q$ be a subset
with $n$ elements and let $r \in P$. Considering the first and the last visit to state $r$ of
a path from $p$ to $q$, we find that

$$L_{p,q}(P) = L_{p,q}(P \setminus \{r\}) \cup L_{p,r}(P \setminus \{r\})\, L_{r,r}(P \setminus \{r\})^*\, L_{r,q}(P \setminus \{r\}).$$

Since $P \setminus \{r\}$ has cardinality $n - 1$, it follows from the induction hypothesis that
$L_{p,q}(P)$ is rational.

This concludes the proof of the Kleene-Büchi theorem.
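
The induction above is effective and can be sketched as a short program. The version below is our own illustration: rational expressions are produced as strings in the syntax of Python's `re` module (a choice made here for testability, not something the text prescribes), `None` stands for the empty language, and the empty string stands for $\{\varepsilon\}$. States are added to $P$ one at a time, and each round applies the displayed recurrence.

```python
import re

def union(e, f):
    """Expression for the union; None denotes the empty language."""
    if e is None:
        return f
    if f is None:
        return e
    return e if e == f else f"(?:{e}|{f})"

def concat(*parts):
    """Expression for the product; it is empty as soon as one factor is empty."""
    return None if any(p is None for p in parts) else "".join(parts)

def star(e):
    """Expression for the star; the star of the empty language or of {epsilon} is {epsilon}."""
    return "" if not e else f"(?:{e})*"

def to_regex(states, trans, initials, finals):
    """Expression for the language of an automaton with transitions (p, a, q)."""
    states = list(states)
    # Base case P = empty set: single letters, plus epsilon when p == q.
    L = {}
    for p in states:
        for q in states:
            e = "" if p == q else None
            for (s, a, t) in trans:
                if (s, t) == (p, q):
                    e = union(e, re.escape(a))
            L[p, q] = e
    # Induction step: allow one more internal state r at each round.
    for r in states:
        L = {
            (p, q): union(L[p, q], concat(L[p, r], star(L[r, r]), L[r, q]))
            for p in states for q in states
        }
    result = None
    for i in initials:
        for f in finals:
            result = union(result, L[i, f])
    return result
```

The resulting expressions are correct but far from minimal: their size can grow exponentially with the number of states, which is inherent in this construction.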

1.4.5. Closure properties 

Rational languages enjoy many additional closure properties. 

Proposition 1.11. Let $\varphi\colon A^* \to B^*$ be a morphism and let $L \subseteq B^*$. If $L$ is
rational, then $\varphi^{-1}(L)$ is rational as well.

Sketch of proof. Let $\mathcal{A} = (Q, T, I, F)$ be an automaton over $B$, recognizing $L$, and
let $\mathcal{A}' = (Q, T', I, F)$ be the automaton over $A$ where

$$T' = \{(p, a, q) \mid p \xrightarrow{\varphi(a)} q \text{ is a path in } \mathcal{A}\}.$$

It is readily verified that $\mathcal{A}'$ recognizes $\varphi^{-1}(L)$. □

Let $u \in A^*$ and $L \subseteq A^*$. The left and right quotients of $L$ by $u$ are defined as
follows:

$$u^{-1}L = \{v \in A^* \mid uv \in L\};$$
$$Lu^{-1} = \{v \in A^* \mid vu \in L\}.$$

These notions are generalized to languages: if $K$ and $L$ are languages, the left and
right quotients of $L$ by $K$ are defined as follows:

$$K^{-1}L = \{v \in A^* \mid \exists u \in K \text{ such that } uv \in L\} = \bigcup_{u \in K} u^{-1}L,$$
$$LK^{-1} = \{v \in A^* \mid \exists u \in K \text{ such that } vu \in L\} = \bigcup_{u \in K} Lu^{-1}.$$

Proposition 1.12. If $L \subseteq A^*$ is rational and $K \subseteq A^*$ is any language (possibly
not rational), then $K^{-1}L$ and $LK^{-1}$ are rational as well.

Sketch of proof. Let $\mathcal{A} = (Q, T, I, F)$ be an automaton recognizing $L$. Let $I'$ be the
set of states of $\mathcal{A}$ which are accessible from an initial state of $\mathcal{A}$ following a path
labeled by a word of $K$,

$$I' = \{q \in Q \mid \exists i \in I,\ \exists u \in K \text{ such that } i \xrightarrow{u} q\}.$$

Then one shows that $\mathcal{A}' = (Q, T, I', F)$ recognizes $K^{-1}L$. The proof for $LK^{-1}$ is
similar. □
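
When $K$ is given explicitly as a finite set of words, the set $I'$ can be computed by following paths, as in the sketch below. The representation `(trans, initials, finals)` with transitions as triples, and the restriction to a finite $K$, are assumptions made for this illustration only.

```python
def step(trans, states, a):
    """States reachable from `states` by one transition labeled a."""
    return {q for (p, b, q) in trans if p in states and b == a}

def reach(trans, states, word):
    """States reachable from `states` along paths labeled by the word."""
    for a in word:
        states = step(trans, states, a)
    return set(states)

def left_quotient(automaton, K):
    """Automaton for K^{-1}L: same transitions and final states, new initial states I'."""
    trans, initials, finals = automaton
    new_initials = set()
    for u in K:                      # K is assumed to be a finite collection of words
        new_initials |= reach(trans, set(initials), u)
    return trans, new_initials, finals

def accepts(automaton, word):
    """Non-deterministic acceptance: some path from an initial state ends in a final state."""
    trans, initials, finals = automaton
    return bool(reach(trans, set(initials), word) & set(finals))
```

Only the set of initial states changes, which is why the construction works for an arbitrary language $K$ even when $I'$ cannot be computed.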

Remark 1.6. The proof of Proposition 1.12 is not effective: we may not be able
to construct the set of states $I'$ associated with $K$. However, if $K$ is rational too,
then $I'$ is effectively constructible.

Recall that a word $u$ is a prefix of the word $v$ if there exists a word $v' \in A^*$ such
that $v = uv'$ (that is: $v$ "starts" with $u$). Similarly, $u$ is a suffix of $v$ if there exists
a word $v' \in A^*$ such that $v = v'u$. Finally $u$ is a factor of $v$ if there exist words
$v', v'' \in A^*$ such that $v = v'uv''$.

If $L$ is a language, we let $\mathrm{Pref}(L)$ (resp. $\mathrm{Suff}(L)$, $\mathrm{Fact}(L)$) be the set of all
prefixes (resp. suffixes, factors) of the words in $L$.

Proposition 1.13. If $L \subseteq A^*$ is rational, then $\mathrm{Pref}(L)$, $\mathrm{Suff}(L)$ and $\mathrm{Fact}(L)$ are
rational as well.

Proof. The result follows from Proposition 1.12, since $\mathrm{Pref}(L) = L(A^*)^{-1}$,
$\mathrm{Suff}(L) = (A^*)^{-1}L$ and $\mathrm{Fact}(L) = (A^*)^{-1}L(A^*)^{-1}$. □

We leave it to the reader to verify that the following operations also preserve 
rationality. 

The mirror image of a word $u = a_1 \cdots a_n \in A^*$ is the word $\tilde{u} = a_n \cdots a_1$. The
corresponding language operation is given by $\tilde{L} = \{\tilde{u} \mid u \in L\}$ for each $L \subseteq A^*$.

A word $u = a_1 \cdots a_n \in A^*$ is a subword of a word $v \in A^*$ if there exist words
$u_0, \ldots, u_n \in A^*$ such that $v = u_0 a_1 u_1 \cdots a_n u_n$. If $L \subseteq A^*$, we let $\mathrm{SW}(L)$ be the set
of all subwords of the words of $L$.

The shuffle of the words $u$ and $v$ is the set

$$u \mathbin{\sqcup\!\sqcup} v = \{w \in A^* \mid \exists u_1, v_1, \ldots, u_n, v_n \in A^* \text{ such that } u = u_1 \cdots u_n,\ v = v_1 \cdots v_n \text{ and } w = u_1 v_1 \cdots u_n v_n\}.$$

If $K$ and $L$ are languages, we let $K \mathbin{\sqcup\!\sqcup} L = \bigcup_{u \in K,\ v \in L} u \mathbin{\sqcup\!\sqcup} v$.

Proposition 1.14. Let $K, L \subseteq A^*$ be rational languages. Then $\tilde{L}$, $\mathrm{SW}(L)$ and
$K \mathbin{\sqcup\!\sqcup} L$ are rational as well.

1.5. Pumping lemmas 

The characterizations summarized in the Kleene-Büchi theorem are sufficient most
of the time to show that a language is rational. Showing that a language is not
rational is a trickier problem. This short section presents the main tool for that
purpose, namely the pumping lemma. We actually first present a rather abstract
version of this statement, and then its more classical corollaries.

Theorem 1.2. Let $L$ be a rational language. There exists an integer $N > 0$ with
the following property. For each word $w \in L$ and for each sequence of integers
$0 \le i_0 < i_1 < \cdots < i_N \le |w|$, there exist $0 \le j < k \le N$ such that, if $w = u_1 u_2 u_3$
with $|u_1| = i_j$ and $|u_1 u_2| = i_k$, then $u_1 u_2^* u_3 \subseteq L$.

Proof. Let $\mathcal{A}$ be an automaton recognizing $L$, and let $N$ be the number of states
of $\mathcal{A}$. Let $w = a_1 a_2 \cdots a_n \in L$ and let

$$p_0 \xrightarrow{a_1} p_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} p_n$$

be a successful path in $\mathcal{A}$ labeled $w$. Let $0 \le i_0 < i_1 < \cdots < i_N \le n$ be a sequence
of integers. Then two of the $N + 1$ states $p_{i_0}, p_{i_1}, \ldots, p_{i_N}$ are equal, that is, there exist
$0 \le j < k \le N$ such that $p_{i_j} = p_{i_k}$.

Let $u_1 = a_1 \cdots a_{i_j}$, $u_2 = a_{i_j + 1} \cdots a_{i_k}$ and $u_3 = a_{i_k + 1} \cdots a_n$. Of course, $w =
u_1 u_2 u_3$, $|u_1| = i_j$, $|u_1 u_2| = i_k$. The situation is summarized by Figure 1.7: we may
iterate or skip the loop labeled $u_2$ and still retain a successful path, so $u_1 u_2^* u_3 \subseteq L$. □

Fig. 1.7. Proof of the pumping lemma



Corollary 1.1. Let $L$ be a rational language. There exists an integer $N > 0$ such
that, for each word $w \in L$ with length $|w| \ge N$, we can factor $w$ in three parts,
$w = u_1 u_2 u_3$, with $u_2 \neq \varepsilon$ and $u_1 u_2^* u_3 \subseteq L$.

Corollary 1.2. Let $L$ be a rational language. There exists an integer $N > 0$ such
that, for each word $w \in L$ with length $|w| \ge N$, we can factor $w$ in three parts,
$w = u_1 u_2 u_3$, with $u_2 \neq \varepsilon$, $|u_1 u_2| \le N$ (resp. $|u_2 u_3| \le N$) and $u_1 u_2^* u_3 \subseteq L$.

Sketch of proof. To prove Corollary 1.2, we apply Theorem 1.2 with $i_j = j$ (resp.
$i_j = n - N + j$, where $n = |w|$) for $0 \le j \le N$. And to prove Corollary 1.1, we take any sequence. □
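
For a complete deterministic automaton, the factorization of Corollary 1.2 (the case $|u_1 u_2| \le N$, that is, $i_j = j$) can be found by reading the first $N$ letters and stopping at the first repeated state. The sketch below uses a transition dictionary `delta[(state, letter)]`; this encoding and the function names are assumptions of the illustration.

```python
def pump_factorization(delta, q0, w, n_states):
    """Return (u1, u2, u3) with w = u1 u2 u3, u2 non-empty and |u1 u2| <= n_states,
    such that u2 labels a loop in the run on w (requires len(w) >= n_states)."""
    seen, q = {q0: 0}, q0                   # seen[state] = position at which it was visited
    for k, a in enumerate(w[:n_states], start=1):
        q = delta[q, a]
        if q in seen:                       # same state at positions seen[q] < k
            j = seen[q]
            return w[:j], w[j:k], w[k:]
        seen[q] = k
    raise ValueError("word shorter than the number of states")

def accepts(delta, q0, finals, w):
    """Run the DFA on w and test whether the last state is final."""
    q = q0
    for a in w:
        q = delta[q, a]
    return q in finals
```

Since $u_2$ labels a loop, the run on $u_1 u_2^k u_3$ ends in the same state as the run on $w$ for every $k \ge 0$, so acceptance is preserved.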

Example 1.19. It is a classical application of Corollary 1.1 that $\{a^n b^n \mid n \ge 0\}$ is
not rational: for each $N > 0$, the word $w = a^N b^N$ cannot be factored as $w = u_1 u_2 u_3$
with $u_2 \neq \varepsilon$ and $u_1 u_2^* u_3 \subseteq \{a^n b^n \mid n \ge 0\}$.

Corollary 1.2 can be used to show that $\{u \in \{a, b\}^* \mid |u|_a = |u|_b\}$ is not rational
(take again $a^N b^N$); however, this language satisfies the necessary condition for
rationality in Corollary 1.1 with $N = 2$.




Fig. 1.8. Two different automata for $A^*aaaA^*$

Consider now the following language over the alphabet $A = \{a, b, c, d\}$:

$$\{(ab)^n (cd)^n \mid n \ge 0\} \cup A^*\{aa, bb, cc, dd, ac\}A^*.$$

It satisfies the necessary condition for rationality in Corollary 1.2, but it is not
rational, as can be proved using Theorem 1.2.

However, the pumping lemma as stated here may not be enough to prove that
a given language is not rational. Let us say that a word contains a square if it can
be written in the form $uvvw$ with $v \neq \varepsilon$. Then the language

$$\{udv \mid u, v \in \{a, b, c\}^* \text{ and either } u \neq v, \text{ or one of } u \text{ and } v \text{ contains a square}\}$$

satisfies the necessary condition for rationality in Theorem 1.2 (for $N = 4$). Yet it
is not rational (the proof of that fact uses the existence of arbitrarily long words on
the alphabet $\{a, b, c\}$ containing no square).

Ehrenfeucht, Parikh and Rozenberg gave a necessary and sufficient condition for
rationality in the same style as the pumping lemma (see e.g. [18, Theorem 1.3.3]).

1.6. Minimal automaton and syntactic monoid 

Consider the two automata in Figure 1.8. Both are complete and deterministic,
and both recognize the set of words over $A = \{a, b\}$ that contain some occurrence
of the word $aaa$ as a factor, that is, the language $A^*aaaA^*$. The two automata
were designed using different intuitions about how to go about this task: in the
first instance, the underlying algorithm is "keep track of the last two letters read
from the input", as indicated by the state labels, while in the second automaton the
algorithm is "keep track of the length of the longest suffix of a's in the input". Thus
the second automaton achieves the same result with a smaller number of states. It
is easy to see that the second example is also optimal: no complete deterministic
automaton recognizing this language can have a smaller number of states.

In this section we will see that for every rational language $L$ there is a unique
minimal complete deterministic automaton accepting $L$. We will also describe an
efficient algorithm that takes as input an arbitrary complete deterministic automaton
$\mathcal{A}$, and produces as output the minimal automaton for $L(\mathcal{A})$.

1.6.1. Myhill-Nerode equivalence and the minimal automaton

One way to see that there is something inefficient about the first automaton in 
the example above is to observe its behavior on the two input words u = bab and 
v = abb. These words lead from the initial state to two different states. However, 
for purposes of recognizing words in L, there is no point in distinguishing between 
u and v, for no matter what the subsequent input w is, the result will be the same: 
either uw and vw are both in L or both outside of L. 

To formalize this notion of inputs that are indistinguishable with respect to $L$,
we make the following definitions: if $u, v \in A^*$, we define $u \equiv_L v$ if and only if
$u^{-1}L = v^{-1}L$ (see Section 1.4.5). Obviously, $\equiv_L$ is an equivalence relation on $A^*$.
We also note that if $u \equiv_L v$ and $w \in A^*$, then $uw \equiv_L vw$, since $(uw)^{-1}L =
w^{-1}(u^{-1}L)$. An equivalence relation with this multiplicative property is said to be
a right congruence. Further, $L$ itself is a union of $\equiv_L$-classes, since $w \in L$ if and
only if $\varepsilon \in w^{-1}L$.

We can accordingly define a complete deterministic automaton $\mathcal{A}_{\min}(L)$ by making
the states these classes of equivalent words: we set $\mathcal{A}_{\min}(L) = (Q_L, \delta_L, i_L, F_L)$,
where $Q_L = A^*/{\equiv_L}$, $i_L = [\varepsilon]_{\equiv_L}$, and $F_L$ and $\delta_L\colon Q_L \times A \to Q_L$ are defined by

$$F_L = \{[v]_{\equiv_L} \mid v \in L\} \quad\text{and}\quad \delta_L([v]_{\equiv_L}, a) = [va]_{\equiv_L}.$$

We need to show that this is well-defined, since a state will in general have many
different representations of the form $[v]_{\equiv_L}$. But well-definedness is an immediate
consequence of our observation that $\equiv_L$ is a right congruence. We have the following
result.

Theorem 1.3. Let $L \subseteq A^*$.

(1) $\mathcal{A}_{\min}(L)$ accepts $L$.

(2) $L$ is rational if and only if $\equiv_L$ has finite index.

Proof. It follows at once by induction on $|w|$ that for all $w \in A^*$,

$$\delta_L([\varepsilon]_{\equiv_L}, w) = [w]_{\equiv_L}.$$

Since, as observed above, $L$ itself is a union of $\equiv_L$-classes, it follows that $w$ is
accepted if and only if $w \in L$. This proves the first claim.

To prove the second claim in the theorem, note that if $\equiv_L$ has finite index, then
$\mathcal{A}_{\min}(L)$ is a finite automaton, and therefore by (1), $L$ is rational. Conversely, if $L$ is
rational, then it is accepted by some complete deterministic automaton $(Q, \delta, i, F)$
with $Q$ finite. Now suppose $u, v \in A^*$ and $\delta(i, u) = \delta(i, v)$. Then if $w \in A^*$ and
$uw \in L$, we have

$$\delta(i, vw) = \delta(\delta(i, v), w) = \delta(\delta(i, u), w) = \delta(i, uw) \in F,$$

so $vw \in L$. Similarly, $vw \in L$ implies $uw \in L$, so $u \equiv_L v$. Thus the number of
classes of $\equiv_L$ cannot be more than $|Q|$, so $\equiv_L$ has finite index. □

The proof of Theorem 1.3 shows that $\mathcal{A}_{\min}(L)$ has the least number of states
among the complete deterministic automata accepting $L$. The automaton $\mathcal{A}_{\min}(L)$
is called the minimal automaton of $L$. We now give another, more algebraic
justification for this terminology.

1.6.2. Uniqueness and minimality of $\mathcal{A}_{\min}(L)$

Let $\mathcal{A} = (Q, \delta, i, F)$ be a complete deterministic automaton over $A$, and let $L =
L(\mathcal{A})$. We say that $p, q \in Q$ are equivalent states, and write $p \equiv q$, if

$$\{v \in A^* \mid \delta(p, v) \in F\} = \{v \in A^* \mid \delta(q, v) \in F\}.$$

Intuitively, this means that for purposes of recognizing words in $L$, $p$ and $q$ do the
same job, and we might as well merge them into a single state.

We now repeat, in a somewhat different form, an observation made in the proof
of Theorem 1.3. If $\delta(i, u) = \delta(i, v)$, then

$$uw \in L \iff \delta(\delta(i, u), w) \in F \iff \delta(\delta(i, v), w) \in F \iff vw \in L,$$

so that $u \equiv_L v$. We therefore have a well-defined
mapping $\delta(i, w) \mapsto [w]_{\equiv_L}$ from the set of accessible states of $\mathcal{A}$ onto the
states of $\mathcal{A}_{\min}(L)$. Note that this mapping sends the initial state $i = \delta(i, \varepsilon)$ to $[\varepsilon]_{\equiv_L}$,
final states of $\mathcal{A}$ to final states of $\mathcal{A}_{\min}(L)$, and respects the next-state function.
We summarize these observations as follows.

Theorem 1.4. Let $\mathcal{A} = (Q, \delta, i, F)$ be a complete deterministic automaton over $A$,
and let $L = L(\mathcal{A})$. Then there is a map $f$ from the set of accessible states in $Q$ onto
$Q_L$ such that

• for all $a \in A$ and accessible $q \in Q$, $f(\delta(q, a)) = \delta_L(f(q), a)$;

• $f(i) = i_L$;

• $f(F) = F_L$.

Moreover, $f(p) = f(q)$ if and only if $p \equiv q$.

In particular, if $\mathcal{A}$ has the same number of states as $\mathcal{A}_{\min}(L)$, then since $f$ is onto,
the two automata are isomorphic by Theorem 1.4.
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1.6.3. An algorithm for computing the minimal automaton 

Theorem 1.4 says that in principle we can compute the minimal automaton of a
rational language $L$ starting from any complete deterministic automaton $(Q,\delta,i,F)$
accepting $L$, first by removing the inaccessible states and then by merging equivalent
states. We have already seen how to compute the accessible states. How do we
determine whether two states are equivalent? If $p, q$ are inequivalent states, then there is
a word $v \in A^*$ that distinguishes between them, in the sense that $\delta(p,v) \in F$
and $\delta(q,v) \notin F$, or vice versa. It follows from a simple pumping argument that
if such a distinguishing word exists, then it can be chosen to have length no more
than $|Q|^2$. Thus we can effectively determine whether two states are equivalent by
calculating $\delta(p,v)$ and $\delta(q,v)$ for all words up to this length.

Of course, this is a terrible algorithm, since there are on the order of $|A|^{|Q|^2}$
different words to check! In practice, we can proceed as follows. Let $m \geq 0$. We say
$p \equiv_m q$ if for all $v \in A^*$ of length no more than $m$, $\delta(p,v) \in F$ if and only if
$\delta(q,v) \in F$. This is clearly an equivalence relation on $Q$, and $\equiv_{m+1}$ refines
$\equiv_m$ for all $m$. The following lemma improves the $|Q|^2$ bound on the length of
distinguishing words.

Lemma 1.1. Let $p,q \in Q$. Then $p \equiv q$ if and only if $p \equiv_m q$ for $m = |Q| - 2$.

Proof. First suppose that for some $m$, the equivalence relations $\equiv_m$ and $\equiv_{m+1}$
coincide. We claim that $\equiv_m$ and $\equiv$ coincide. To see this, suppose that $p$ and $q$
are inequivalent, and that $w$ is a word of minimal length distinguishing them. If
$|w| > m$, then we can write $w = uv$, where $|v| = m+1$, so that $p' = \delta(p,u)$ and
$q' = \delta(q,u)$ are inequivalent modulo $\equiv_{m+1}$. But this means that they are also
inequivalent modulo $\equiv_m$, and thus distinguished by a word $v'$ of length no more
than $m$. Then $p$ and $q$ are distinguished by the word $uv'$, of length strictly less
than that of $w$, a contradiction. Thus the minimal distinguishing word has length
no more than $m$, so that $\equiv_m$ coincides with $\equiv$.

Now if $\equiv_{m+1}$ does not coincide with $\equiv_m$, then $\equiv_{m+1}$ has a larger number of
classes. Since the number of classes can never exceed $|Q|$, and since $\equiv_0$ has two
classes, the sequence $\{\equiv_m\}_{m \geq 0}$ will stabilize by the time $m$ reaches $|Q| - 2$. □

Lemma 1.1 leads to the following practical algorithm for minimization. We begin
with a list of all the pairs $\{p,q\}$ of distinct accessible states, and mark the pair if
$p \in F$ and $q \notin F$, or vice versa. In each phase of the algorithm, we visit each
unmarked pair $\{p,q\}$ and, for each $a \in A$, we compute $\{p',q'\} = \{\delta(p,a),\delta(q,a)\}$;
we mark $\{p,q\}$ if $\{p',q'\}$ is marked. An easy induction shows that if a pair $\{p,q\}$
is distinguished by a word of length $m$, then it will be marked by the $m$-th phase of
the algorithm. Thus after no more than $|Q| - 2$ phases, the algorithm will not mark
any new pairs, with the result that the algorithm terminates, and the unmarked
pairs are exactly the pairs of equivalent states.
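The marking procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `equivalent_pairs` and the dictionary encoding of the next-state function are choices made here, and the sample input is the four-state automaton for $(ab)^*$ whose letter transitions are given later in the chapter as the vectors (2 3 4 3) and (4 1 4 3), with 1 as initial and final state.

```python
from itertools import combinations

def equivalent_pairs(states, alphabet, delta, final):
    """Return the unmarked pairs, i.e. the pairs of equivalent states.

    Assumes a complete deterministic automaton whose states are all
    accessible; delta maps (state, letter) to the next state.
    """
    pairs = [frozenset(pair) for pair in combinations(states, 2)]
    # Initial marking: exactly one state of the pair is final.
    marked = {pair for pair in pairs if len(pair & final) == 1}
    while True:
        newly_marked = set()
        for pair in pairs:
            if pair in marked:
                continue
            p, q = tuple(pair)
            for a in alphabet:
                successor = frozenset((delta[p, a], delta[q, a]))
                if successor in marked:
                    newly_marked.add(pair)
                    break
        if not newly_marked:  # a phase that marks nothing: stop
            return set(pairs) - marked
        marked |= newly_marked

# Sample automaton (transitions taken from the vectors quoted above).
states = [1, 2, 3, 4]
f_a = {1: 2, 2: 3, 3: 4, 4: 3}
f_b = {1: 4, 2: 1, 3: 4, 4: 3}
delta = {(q, "a"): f_a[q] for q in states}
delta.update({(q, "b"): f_b[q] for q in states})

result = equivalent_pairs(states, "ab", delta, final={1})
print(result)
```

Each pass of the `while` loop plays the role of one phase; marks found during a phase are only added at its end, so the pass count matches the phase count used in the induction. On the sample input the only unmarked pair is {3, 4}.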

Example 1.20. Consider the first automaton in Figure 1.9. Initially we mark the
pairs $\{i,j\}$, where $i \in \{1,2,3\}$ and $j \in \{4,5,6\}$. On the next pass, the pairs
$\{4,6\}$ and $\{5,6\}$ are marked since applying $b$ to these pairs gives the marked pair
$\{3,6\}$. No further pairs are marked on the next pass, so the algorithm terminates.
Since the pairs $\{1,2\}$ and $\{2,3\}$ are unmarked, $\{1,2,3\}$ is an equivalence class, and
since $\{4,5\}$ is unmarked, it forms a second class. The remaining class is $\{6\}$. The
resulting minimal automaton is pictured on the right-hand side of Figure 1.9.

[Fig. 1.9. The minimization algorithm]

[Fig. 1.10. A minimal automaton]



Example 1.21. We now apply the algorithm to the automaton in Figure 1.10.
Initially, the pairs $\{i,6\}$ with $i < 6$ are marked. On the next pass the pairs $\{i,5\}$
with $i < 5$ are marked, and so on, until on the fifth pass the pair $\{1,2\}$ is marked. The
result is that every pair of distinct states is marked: the automaton is already
minimal.

The pair-marking implementation of the algorithm just illustrated is suitable
for small examples worked by hand. In the worst case, shown in the last example,
we check $O(|Q|^2)$ unmarked pairs on each pass, and make $O(|Q|)$ passes, with
$|A|$ consultations of the state-transition table for each pair we inspect. Thus, the
overall time complexity of the algorithm is $O(|A| \cdot |Q|^3)$. More astute bookkeeping,
in which we partition equivalence classes at each step, rather than marking pairs
of inequivalent states, leads to an $O(|A| \cdot |Q|^2)$ algorithm (Moore [12]). This can be
further improved to $O(|A| \cdot |Q| \cdot \log |Q|)$ (Hopcroft [6]).
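The partition-based bookkeeping can be sketched as follows. This is an illustration in the spirit of the refinement scheme attributed to Moore above, not a transcription of [12]; the name `moore_classes` and the input encoding are assumptions, and the sample automaton is again the four-state automaton for $(ab)^*$ with letter vectors (2 3 4 3) and (4 1 4 3) and final state 1.

```python
def moore_classes(states, alphabet, delta, final):
    """Group the states of a complete DFA into classes of equivalent states."""
    # Block numbers for the coarsest relation: final versus non-final states.
    block = {q: int(q in final) for q in states}
    while True:
        # Two states stay together only if they share a block and every
        # letter sends them into a common block.
        signature = {
            q: (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            for q in states
        }
        numbering = {}
        refined = {q: numbering.setdefault(signature[q], len(numbering))
                   for q in states}
        if len(numbering) == len(set(block.values())):
            break  # no block was split: the partition is stable
        block = refined
    classes = {}
    for q in states:
        classes.setdefault(block[q], []).append(q)
    return sorted(classes.values())

states = [1, 2, 3, 4]
f_a = {1: 2, 2: 3, 3: 4, 4: 3}
f_b = {1: 4, 2: 1, 3: 4, 4: 3}
delta = {(q, "a"): f_a[q] for q in states}
delta.update({(q, "b"): f_b[q] for q in states})

classes = moore_classes(states, "ab", delta, final={1})
print(classes)
```

With hashed signatures, each pass costs on the order of $|A| \cdot |Q|$ dictionary operations, and since every pass but the last splits at least one block, there are at most $|Q|$ passes.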



1.6.4. The transition monoid of an automaton 

Let $\mathcal{A} = (Q,\delta,i,F)$ be a complete deterministic automaton over an alphabet $A$.
Let $w \in A^*$. We study the maps

$$f_w^{\mathcal{A}} \colon q \mapsto \delta(q,w)$$

from $Q$ into itself. We will write the image of a state $q$ under $f_w^{\mathcal{A}}$ as $q f_w^{\mathcal{A}}$ rather
than the more traditional $f_w^{\mathcal{A}}(q)$. We then have, for $v,w \in A^*$,

$$f_{vw}^{\mathcal{A}} = f_v^{\mathcal{A}} f_w^{\mathcal{A}},$$

where the product on the right-hand side of the equation is left-to-right composition
of functions; that is, $q(f_v^{\mathcal{A}} f_w^{\mathcal{A}}) = (q f_v^{\mathcal{A}}) f_w^{\mathcal{A}}$.

[Fig. 1.11. The automaton $\mathcal{A}_1$, with no indication of initial or terminal states]

We will henceforth drop the superscript $\mathcal{A}$, except in situations where several
different automata are involved. Observe that $f_\varepsilon$ is the identity map on $Q$. Thus
the set of maps

$$M(\mathcal{A}) = \{f_w \mid w \in A^*\}$$

forms an algebraic structure with an associative product and an identity element
(usually denoted 1). Such a structure is called a monoid, and we call $M(\mathcal{A})$ the
transition monoid of $\mathcal{A}$. Observe that if $Q$ is finite, then $M(\mathcal{A})$ is finite, and that
the structure of $M(\mathcal{A})$ depends only on the next-state function $\delta$, and not at all on
the initial or final states.

$A^*$ is, of course, itself a monoid, with concatenation of words as the operation
and the empty word $\varepsilon$ as the identity. The map

$$\varphi \colon w \mapsto f_w$$

is consequently a monoid morphism from $A^*$ into $M(\mathcal{A})$; that is, it satisfies

$$\varphi(w_1 w_2) = \varphi(w_1)\varphi(w_2)$$

for all $w_1, w_2$ in $A^*$, and it maps the identity element of $A^*$ to the identity element
of $M(\mathcal{A})$.

Example 1.22. In the diagrams in this example and in Examples 1.23 and 1.24
we indicate only the transitions between states, since, as we have observed, the
initial and final states do not enter into the computation of the transition monoid
of an automaton.

[Fig. 1.12. The automata $\mathcal{A}_2$ and $\mathcal{A}_3$]

First, consider the automaton $\mathcal{A}_1$ in Figure 1.11. We will write an element $f_w$
of $M(\mathcal{A}_1)$ as a vector $f_w = (1f_w \;\; 2f_w \;\; 3f_w)$. We can then begin enumerating the
elements of $M(\mathcal{A}_1)$:

$$f_\varepsilon = (1\ 2\ 3)$$
$$f_a = (2\ 3\ 3) \qquad f_b = (3\ 1\ 3)$$
$$f_{aa} = (3\ 3\ 3) \qquad f_{ab} = (1\ 3\ 3) \qquad f_{ba} = (3\ 2\ 3) \qquad f_{bb} = (3\ 3\ 3)$$

We could continue enumerating like this, but instead we note that $f_{aba} = f_a$,
$f_{bab} = f_b$, and for all other $w \in A^*$ of length 3, $f_w = (3\ 3\ 3)$. Thus the inventory above
is the entire transition monoid, since any transition induced by a word of length
greater than 2 is equal to one induced by a shorter word. Thus $M(\mathcal{A}_1)$ has 6 elements:
$1$, $\alpha = f_a$, $\beta = f_b$, $\alpha\beta$, $\beta\alpha$, and $0$. The multiplication is then determined by the laws
$\alpha\alpha = \beta\beta = 0$, $\alpha = \alpha\beta\alpha$, and $\beta = \beta\alpha\beta$. The complete multiplication table is shown
below:





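$$
\begin{array}{c|cccccc}
 & 1 & \alpha & \beta & \alpha\beta & \beta\alpha & 0 \\
\hline
1 & 1 & \alpha & \beta & \alpha\beta & \beta\alpha & 0 \\
\alpha & \alpha & 0 & \alpha\beta & 0 & \alpha & 0 \\
\beta & \beta & \beta\alpha & 0 & \beta & 0 & 0 \\
\alpha\beta & \alpha\beta & \alpha & 0 & \alpha\beta & 0 & 0 \\
\beta\alpha & \beta\alpha & 0 & \beta & 0 & \beta\alpha & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$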



























This example illustrates an important general point: There is an effective procedure
for computing the multiplication table of the transition monoid of a complete
deterministic finite automaton. We enumerate the maps $f_w$ until we find that all
words of some length induce the same maps as shorter words.
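A minimal sketch of this enumeration, assuming the states are numbered $1,\ldots,n$ and each letter is given by its vector of images (the names `compose`, `transition_monoid` and `multiplication_table` are hypothetical). Maps are explored in order of word length, and a map is extended only when it is new, which is enough because $f_u = f_v$ implies $f_{ua} = f_{va}$. The sample input is the automaton of Example 1.22.

```python
from collections import deque

def compose(f, g):
    """Left-to-right product: q(fg) = (qf)g, for states numbered 1..n."""
    return tuple(g[x - 1] for x in f)

def transition_monoid(n, letter_maps):
    """Map each element of the transition monoid to a shortest word inducing it."""
    identity = tuple(range(1, n + 1))
    word_of = {identity: ""}
    queue = deque([identity])
    while queue:
        f = queue.popleft()
        for letter, f_letter in letter_maps.items():
            g = compose(f, f_letter)
            if g not in word_of:  # a map not induced by any shorter word
                word_of[g] = word_of[f] + letter
                queue.append(g)
    return word_of

def multiplication_table(monoid):
    """Return {(u, v): w}, where u, v, w are the shortest words chosen above."""
    return {(monoid[f], monoid[g]): monoid[compose(f, g)]
            for f in monoid for g in monoid}

letter_maps = {"a": (2, 3, 3), "b": (3, 1, 3)}
monoid = transition_monoid(3, letter_maps)
table = multiplication_table(monoid)
print(sorted(monoid.values(), key=lambda w: (len(w), w)))
```

In the resulting table the word `aa` stands for the element written 0 in the example, so the entry for (`ab`, `a`) being `a` corresponds to the law $\alpha\beta\alpha = \alpha$.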

Example 1.23. Consider the automaton $\mathcal{A}_2$ in Figure 1.12. The transition monoid
is generated by the two maps $f_a$ and $f_b$, both of which are permutations of
the set of states: $f_a$ cycles the five states and $f_b$ transposes a pair of adjacent states.
It is well known from elementary group theory that we can obtain all transpositions
$t$ of adjacent elements by repeated conjugation with the cycle (the map $t \mapsto f_a^{-1} t f_a$),
and that all permutations of the states can be obtained by composing transpositions
of pairs of adjacent elements. So $M(\mathcal{A}_2)$ consists of all the permutations of
$\{1,2,3,4,5\}$, and is consequently the symmetric group of degree 5, with $5! = 120$
elements. Of course, we can do likewise with any finite set of states.

Example 1.24. Now consider the effect of adding a third input letter to the
preceding example, obtaining the automaton $\mathcal{A}_3$ in Figure 1.12. It is not hard to
show that every map from $\{1,2,3,4,5\}$ into itself can be obtained by repeatedly
composing $f_c$ with permutations. Thus $M(\mathcal{A}_3)$ is the full transformation monoid
on 5 states, which has $5^5 = 3125$ elements. We can similarly generate a transition
monoid with $n^n$ elements using an $n$-state automaton.
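The two counts can be checked by brute force. Since the figure is not reproduced here, the three generators below are assumptions chosen to fit the description in Examples 1.23 and 1.24: a 5-cycle, a transposition of the adjacent states 1 and 2, and a map that sends 2 to 1 and fixes every other state. The function name `generated_monoid` is likewise hypothetical.

```python
def generated_monoid(n, generators):
    """Close the identity on {1..n} under right multiplication by generators."""
    identity = tuple(range(1, n + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in generators:
            h = tuple(g[x - 1] for x in f)  # left-to-right product fg
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

cycle = (2, 3, 4, 5, 1)      # assumed f_a: cycles the five states
swap = (2, 1, 3, 4, 5)       # assumed f_b: transposes states 1 and 2
collapse = (1, 1, 3, 4, 5)   # assumed f_c: a map with four distinct images

size_two_letters = len(generated_monoid(5, [cycle, swap]))
size_three_letters = len(generated_monoid(5, [cycle, swap, collapse]))
print(size_two_letters, size_three_letters)
```

The design relies on a standard fact: the symmetric group together with any single map having exactly $n-1$ distinct images generates the full transformation monoid on $n$ points.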

1.6.5. The syntactic monoid 

Now let $L \subseteq A^*$, and consider the transition monoid of the minimal automaton
$\mathcal{A}_{\min}(L) = (Q_L, \delta_L, i_L, F_L)$. Let $u, v \in A^*$. When are the two elements $f_u$, $f_v$ of
this monoid the same? If they are different, then there is some state $q$ such that
$q f_u \neq q f_v$. Since the automaton is minimal, there is a word $y \in A^*$ distinguishing
these two states, so that $q f_u f_y \in F_L$ and $q f_v f_y \notin F_L$, or vice versa. Since every
state is accessible, there is also a word $x$ such that $q = i_L f_x$, so that either $xuy \in L$
and $xvy \notin L$, or vice versa. Conversely, if such a pair of words $x, y$ exists, then $f_u$
and $f_v$ cannot be equal. We thus have:



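Theorem 1.5. Let $L \subseteq A^*$, and let $u, v \in A^*$. Let $\mathcal{A} = \mathcal{A}_{\min}(L)$. Then
$f_u^{\mathcal{A}} = f_v^{\mathcal{A}}$ if and only if for all $x, y \in A^*$,

$$xuy \in L \iff xvy \in L.$$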



If the conditions in this theorem are satisfied, then we write $u \cong_L v$. The
equivalence relation $\cong_L$ is called the syntactic congruence of $L$, and the transition
monoid of $\mathcal{A}_{\min}(L)$ is called the syntactic monoid of $L$. We denote the syntactic
monoid of $L$ by $M(L)$. In algebraic terms, $M(L) = M(\mathcal{A}_{\min}(L)) = A^*/{\cong_L}$; that
is, $M(L)$ is the quotient monoid of $A^*$ by the syntactic congruence. The morphism
mapping each $w \in A^*$ to its $\cong_L$-class is called the syntactic morphism of $L$, and is
denoted $\mu_L$.

The syntactic congruence is a two-sided congruence on $A^*$; that is, if $u \cong_L v$
and $u' \cong_L v'$, then $uu' \cong_L vv'$. Compare this to the Myhill-Nerode congruence $\equiv_L$,
which, as we noted, is a right congruence. The equivalence $\cong_L$ refines $\equiv_L$.

Transition monoids, and, in particular, the syntactic monoid, allow us to place
many questions about the behavior of automata in a purely algebraic setting. For
instance, we have the following algebraic characterization of rationality: Let $M$ be a
monoid and $\varphi \colon A^* \to M$ a morphism. We say that $\varphi$ recognizes $L \subseteq A^*$ if and only
if there is a subset $X$ of $M$ such that $L = \varphi^{-1}(X)$. We also say in this situation
that $M$ recognizes $L$.
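As a small illustration of recognition by a morphism, the sketch below takes the letter maps $(2\ 3\ 3)$ and $(3\ 1\ 3)$ computed in Example 1.22 as the images of $a$ and $b$, and takes $X$ to be the set of maps that send state 1 to itself. The function names are hypothetical, and the comparison against Python's `re` module on all words of length at most 6 is only a sanity check, not a proof.

```python
import re
from itertools import product

F_A = (2, 3, 3)        # image of the letter a
F_B = (3, 1, 3)        # image of the letter b
IDENTITY = (1, 2, 3)   # image of the empty word

def phi(word):
    """Morphism from {a, b}* to maps on {1, 2, 3}: product of the letter images."""
    f = IDENTITY
    for letter in word:
        g = F_A if letter == "a" else F_B
        f = tuple(g[x - 1] for x in f)  # left-to-right composition
    return f

def recognized(word):
    """Membership in the inverse image of X = {maps sending state 1 to itself}."""
    return phi(word)[0] == 1

words = ["".join(t) for k in range(7) for t in product("ab", repeat=k)]
agrees = all(recognized(w) == bool(re.fullmatch(r"(ab)*", w)) for w in words)
print(agrees)
```

Deciding membership only requires the value $\varphi(w)$, which is computed letter by letter; this is the same bookkeeping as running an automaton whose states are the elements of the monoid.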

Theorem 1.6. Let $L \subseteq A^*$. The following are equivalent:

(1) $L$ is rational.

(2) $M(L)$ is finite.

(3) $L$ is recognized by a finite monoid.

Proof. To show (1) implies (2), note that if $L$ is rational, then $\mathcal{A}_{\min}(L)$ has a finite
set of states, and thus its transition monoid, $M(L)$, is finite. For (2) implies (3),
if $u \in L$ and $u \cong_L v$, then $v = \varepsilon v \varepsilon$ is also in $L$. Thus $L$ is a union of equivalence
classes of $\cong_L$, so that $L = \mu_L^{-1}(X)$, where $X = \{f_w \in M(\mathcal{A}_{\min}(L)) \mid w \in L\}$.
Finally, to show (3) implies (1), suppose $\varphi \colon A^* \to M$, where $M$ is finite, and
that $L = \varphi^{-1}(X)$. Then $L$ is accepted by the complete deterministic automaton
$\mathcal{A}(M) = (M, \delta, 1, X)$, where for $m \in M$ and $a \in A$,

$$\delta(m,a) = m\,\varphi(a).$$

Since $M$ is finite, $L$ is rational. □

Remark 1.7. Observe that if $M$ is a finite monoid and $\mathcal{A}(M)$ is the automaton
defined in the proof of Theorem 1.6, then the transition monoid of $\mathcal{A}(M)$ is $M$ itself.

The syntactic monoid plays the same role in this algebraic view of rational
languages that the minimal automaton plays in the automaton-theoretic view. Here
we make this precise: We say that a monoid $N$ divides a monoid $M$, and write
$N \prec M$, if there is a submonoid $M'$ of $M$ and a surjective morphism $\psi \colon M' \to N$.
It is easy to see that $\prec$ is a transitive relation on monoids.

Theorem 1.7. Let $L \subseteq A^*$. Then $M$ recognizes $L$ if and only if $M(L) \prec M$.

Proof. First suppose $M$ recognizes $L$, so that there is a morphism $\varphi \colon A^* \to M$
such that $L = \varphi^{-1}(X)$ for some $X \subseteq M$. We claim that if $\varphi(u) = \varphi(v)$, then
$u \cong_L v$. To see this, suppose that $xuy \in L$ for some $x, y \in A^*$. Then $\varphi(xuy) \in X$,
and since $\varphi(u) = \varphi(v)$, we have $\varphi(xvy) \in X$, so that $xvy \in L$. By the same
argument, if $xvy \in L$ then $xuy \in L$, so that $u \cong_L v$.

Now let $M' = \varphi(A^*)$. We define a map $\psi \colon M' \to M(L)$ by $\psi(\varphi(u)) = \mu_L(u)$.
By the remark just made, $\psi$ is well-defined, since the value of $\psi$ only depends on
$\varphi(u)$ and not on $u$. Moreover $\psi$ is clearly a morphism, and it is surjective because
$\mu_L$ is, so $M(L) \prec M$.

Conversely, suppose $M(L) \prec M$, so that there is a morphism $\psi$ from a submonoid
$M'$ of $M$ onto $M(L)$. For $a \in A$, we set $\varphi(a)$ to be any $m \in M'$ for which
$\psi(m) = \mu_L(a)$. This can be extended to a unique morphism $\varphi \colon A^* \to M$ such that
$\mu_L = \psi \circ \varphi$. Let $X = \varphi(L)$. If $\varphi(u) \in X$, then $\varphi(u) = \varphi(v)$ for some $v \in L$, and thus
$\mu_L(u) = \mu_L(v)$, so $u \cong_L v$. Since, as noted in the proof of Theorem 1.6, $L$ is a union
of $\cong_L$-classes, this implies $u \in L$, so that $L = \varphi^{-1}(X)$, and thus $M$ recognizes $L$. □

[Fig. 1.13. An automaton whose transition monoid contains a non-trivial group]

Example 1.25. Consider the transition monoid of the automaton $\mathcal{A}$ in Figure 1.13.
We can fairly easily determine the elements of this monoid without doing an
exhaustive tabulation: First, if a word $w$ has even length, then it maps $\{1,3\}$ into
$\{1,3\}$, and $\{2,4\}$ into $\{2,4\}$, while if $w$ has odd length, then it interchanges these
two sets. Second, if $w$ contains $aa$ or $bb$ as a factor, then the image of $f_w$ is
contained in $\{3,4\}$. Finally, if the letters of $w$ alternate, then $f_w$ maps either 1 or 2 to
$\{1,2\}$, but not both, depending on whether the first letter of $w$ is $a$ or $b$. We thus
get these elements:

$$1 = f_\varepsilon = (1\ 2\ 3\ 4) \qquad \gamma = f_a = (2\ 3\ 4\ 3) \qquad \delta = f_b = (4\ 1\ 4\ 3)$$
$$\gamma\delta = f_{ab} = (1\ 4\ 3\ 4) \qquad \delta\gamma = f_{ba} = (3\ 2\ 3\ 4)$$
$$\gamma^2 = f_{aa} = (3\ 4\ 3\ 4) \qquad \gamma^3 = f_{aaa} = (4\ 3\ 4\ 3)$$

Observe that $\{\gamma^2, \gamma^3\}$ forms a group, permuting the states 3 and 4. This automaton
accepts the language $(ab)^*$. In algebraic terms, the morphism $\varphi \colon w \mapsto f_w$ from $A^*$
into $M(\mathcal{A})$ recognizes this language with $(ab)^* = \varphi^{-1}(X)$, where

$$X = \{f \in M(\mathcal{A}) \mid f \text{ maps state 1 to itself}\}.$$

The states 3 and 4 are equivalent, and the minimal automaton of $L = (ab)^*$ is obtained
by merging these states: it is the automaton examined in Example 1.22 (with 1 as
initial and final state), where we computed its transition monoid, namely $M(L)$.
According to Theorem 1.7, $M(L) \prec M(\mathcal{A})$, and, indeed, the map sending 1 to 1,
$\gamma, \delta, \gamma\delta, \delta\gamma$ to $\alpha, \beta, \alpha\beta, \beta\alpha$, respectively, and $\gamma^2$ and $\gamma^3$ both to 0, is a morphism
from $M(\mathcal{A})$ onto $M(L)$.
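The claims of this example can be checked mechanically. The sketch below is an illustration only, with hypothetical helper names: it enumerates both transition monoids from the letter vectors given in Examples 1.22 and 1.25, defines the candidate map by sending the element induced by a word in the four-state automaton to the element induced by the same word in the three-state automaton, and tests that this map is a surjective morphism.

```python
def compose(f, g):
    """Left-to-right product of two maps on states numbered 1..n."""
    return tuple(g[x - 1] for x in f)

def monoid_with_words(n, letter_maps):
    """Map each element of the transition monoid to a shortest inducing word."""
    identity = tuple(range(1, n + 1))
    word_of = {identity: ""}
    frontier = [identity]
    while frontier:
        f = frontier.pop(0)
        for letter, g in letter_maps.items():
            h = compose(f, g)
            if h not in word_of:
                word_of[h] = word_of[f] + letter
                frontier.append(h)
    return word_of

def image(word, n, letter_maps):
    """The map induced by a word."""
    f = tuple(range(1, n + 1))
    for letter in word:
        f = compose(f, letter_maps[letter])
    return f

big = {"a": (2, 3, 4, 3), "b": (4, 1, 4, 3)}   # automaton of Example 1.25
small = {"a": (2, 3, 3), "b": (3, 1, 3)}       # automaton of Example 1.22

M_big = monoid_with_words(4, big)
M_small = monoid_with_words(3, small)
h = {f: image(word, 3, small) for f, word in M_big.items()}

is_morphism = all(h[compose(f, g)] == compose(h[f], h[g])
                  for f in M_big for g in M_big)
is_onto = set(h.values()) == set(M_small)
print(len(M_big), len(M_small), is_morphism, is_onto)
```

The two elements $(3\ 4\ 3\ 4)$ and $(4\ 3\ 4\ 3)$ are both sent to $(3\ 3\ 3)$, the zero of the smaller monoid, which is where the nontrivial group collapses.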

Example 1.26. Let us take the automaton of Example 1.24 and specify 1 as both
the initial state and the unique accepting state. With these choices, the automaton
is the minimal automaton of the language it accepts, since every state is accessible
and no two distinct states are equivalent. This shows that the syntactic monoid of
a language accepted by an $n$-state automaton can have as many as $n^n$ elements.

Example 1.27. Not every finite monoid is the syntactic monoid of a rational
language. Consider, for instance, the monoid $M = \{1, \alpha, \beta, \gamma\}$ with multiplication
$m_1 m_2 = m_2$ for $m_2 \neq 1$. Suppose $A$ is a finite alphabet and $\varphi \colon A^* \to M$ is a
morphism. Let $X \subseteq M$. We partition $A$ into three subsets, $B$, $C$, and $D$:

$$B = \{a \in A \mid \varphi(a) = 1\}$$
$$C = \{a \in A \mid \varphi(a) \in X \setminus \{1\}\}$$
$$D = A \setminus (B \cup C)$$

Then $\varphi^{-1}(X) = B^* \cup A^*CB^*$ if $1 \in X$ and $\varphi^{-1}(X) = A^*CB^*$ otherwise.
(Observe that $B$ or $C$ might be empty.) But then $L = \varphi^{-1}(X)$ is recognized by the
submonoid $\{1, \alpha, \beta\}$, using the morphism that maps $B$ to 1, $C$ to $\alpha$ and $D$ to $\beta$.
Thus every language recognized by $M$ is recognized by a strictly smaller monoid,
so by Theorem 1.7, $M$ cannot be the syntactic monoid of any language.

1.7. First-order definable languages 

This section is devoted to proving one of the earliest and most important
applications of the syntactic monoid: the characterization of the languages definable in
FO(<).

A finite monoid $M$ can contain a nontrivial group, as for example the group
$\{\gamma^2, \gamma^3\}$ in the monoid $M(\mathcal{A})$ of Example 1.25. If there is no nontrivial group in
$M$, we say that $M$ is aperiodic.

Lemma 1.2. Let M be a finite monoid. Then the following are equivalent: 

(1) M is aperiodic. 

(2) There is an integer $n \geq 0$ such that $m^n = m^{n+1}$ for all $m \in M$.

Proof. Suppose $M$ is aperiodic. Let $m \in M$, and consider the sequence
$1, m, m^2, \ldots$ Since $M$ is finite, if we take $n = |M|$, we have $m^r = m^n$ for some
$r < n$. Take the largest such $r$, and consider the set $G = \{m^k \mid r \leq k < n\}$.
Observe that for all $g \in G$, $gG = Gg = G$, since

$$m^{r+t}\, m^s = m^{\,r + ((t+s) \bmod (n-r))}$$

for all $s,t \geq 0$. This implies that $G$ is a group, so that $|G| = 1$, and thus $r = n - 1$
and $m^r = m^{r+1}$. Conversely, if $M$ is not aperiodic, then $M$ contains a nontrivial
group $G$, and an element $g \in G$ different from the identity element $e$ of $G$. Then
$g^k = e$ for some $k > 1$, so that $g^n \neq g^{n+1}$ for all $n \geq 0$. □

Note that the proof shows that we can choose $n$ in condition (2) of Lemma 1.2
to be $|M| - 1$.
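The behaviour of the powers of a single element can be observed directly. The following sketch (with the hypothetical name `index_and_period`) lists the powers $m, m^2, \ldots$ until one repeats. It starts from $m$ rather than from $1 = m^0$, an assumption made here so that no identity element has to be supplied; an element generates a nontrivial group exactly when the reported period exceeds 1. The two sample elements are the maps $(2\ 3\ 4\ 3)$ and $(2\ 3\ 3)$ from Examples 1.25 and 1.22.

```python
def compose(f, g):
    """Left-to-right product of two maps on states numbered 1..n."""
    return tuple(g[x - 1] for x in f)

def index_and_period(m, multiply=compose):
    """Return (r, p) with r >= 1 and p >= 1 least such that m^(r+p) = m^r."""
    first_seen = {}
    power, exponent = m, 1
    while power not in first_seen:
        first_seen[power] = exponent
        power = multiply(power, m)
        exponent += 1
    r = first_seen[power]
    return r, exponent - r

gamma = (2, 3, 4, 3)   # generates the two-element group of Example 1.25
alpha = (2, 3, 3)      # element of the aperiodic monoid of Example 1.22
print(index_and_period(gamma), index_and_period(alpha))
```

For the first element the powers cycle with period 2 from the second power on; for the second, they become constant from the second power on.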

We say that a language $L \subseteq A^*$ is star-free if it can be defined by an extended
rational expression without the use of the $*$ operation or morphic images. The
Schützenberger-McNaughton-Papert Theorem offers the following characterization.

Theorem 1.8. Let $L \subseteq A^*$ be a rational language. Then the following are equivalent.

(1) L is star-free. 

(2) L is definable by a sentence of FO(<). 

(3) L is recognized by an aperiodic finite monoid. 

(4) M(L) is aperiodic. 



September 22, 2011 4:33 World Scientific Review Volume - 9.75in x 6.5in chapl 



An Introduction to Finite Automata 39 

Before we turn to the proof of this theorem, we give an important corollary, and 
an example. 

Corollary 1.3. It is decidable whether a rational language (given by a rational
expression or an accepting automaton) is definable by a sentence of first-order logic.

Proof. As we have seen, we can compute $\mathcal{A}_{\min}(L)$ from any automaton or
expression for $L$, and thence compute the multiplication table of $M = M(L)$. We
can then test for all $m \in M$ whether $m^{|M|+1} = m^{|M|}$, and thus, by Lemma 1.2,
determine whether $M(L)$ is aperiodic. By Theorem 1.8, this decides whether $L$ is
first-order definable. □
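The test in this proof can be sketched as follows, assuming the input is already the minimal complete automaton of the language, given by the image vector of each letter on states $1,\ldots,n$ (the function names are hypothetical, and minimization itself is not repeated here). The first sample is the automaton of Example 1.22; the second, a two-state automaton in which the single letter swaps the states, is an assumed example corresponding to words of even length over a one-letter alphabet.

```python
def compose(f, g):
    """Left-to-right product of two maps on states numbered 1..n."""
    return tuple(g[x - 1] for x in f)

def transition_monoid(n, letter_maps):
    """All maps induced by words, as a set of tuples."""
    identity = tuple(range(1, n + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in letter_maps.values():
            h = compose(f, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def power(f, k, identity):
    """The k-th power of f, by repeated multiplication."""
    result = identity
    for _ in range(k):
        result = compose(result, f)
    return result

def is_aperiodic(n, letter_maps):
    """Test m^|M| = m^(|M|+1) for every element m of the transition monoid."""
    monoid = transition_monoid(n, letter_maps)
    identity = tuple(range(1, n + 1))
    size = len(monoid)
    return all(power(m, size, identity) == power(m, size + 1, identity)
               for m in monoid)

ab_star = is_aperiodic(3, {"a": (2, 3, 3), "b": (3, 1, 3)})
even_length = is_aperiodic(2, {"a": (2, 1)})
print(ab_star, even_length)
```

As the surrounding text notes, the monoid can be very large compared to the automaton, so this direct enumeration is only practical for small inputs.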

In fact, the proof of Theorem 1.8 will show that if $M(L)$ is aperiodic, then we
can effectively construct both a star-free expression and a first-order sentence for $L$
from an automaton that recognizes $L$.

Example 1.28. Let $L = (ab)^*$. We computed $M(L)$ in Example 1.22. We have
$\alpha^2 = \beta^2 = 0 = \alpha^3 = \beta^3$, and $(\alpha\beta)^2 = \alpha\beta$, $(\beta\alpha)^2 = \beta\alpha$, so by Lemma 1.2, $M(L)$
is aperiodic. Theorem 1.8 says that $L$ is definable by a star-free extended rational
expression, and also by a sentence of FO(<). Let us exhibit such expressions.

First, note that membership of a word $w$ in $L$ is equivalent to saying that $w$
contains no occurrence of either $aa$ or $bb$ as a factor, and that the first letter of $w$
(if there is one) is $a$, and the last letter is $b$. We thus have

$$L = \{\varepsilon\} \cup \bigl(aA^* \cap A^*b \cap \overline{A^*(aa \cup bb)A^*}\bigr),$$

where the overline denotes complement in $A^*$, witnessing the fact that $L$ is star-free
(note that $A^*$ is star-free, since $A^* = \overline{\emptyset}$).

To obtain a first-order sentence defining $L$, we use the same characterization of
words in $L$. We say there is no occurrence of $aa$ as a factor using the following
sentence:

$$\neg\exists x\,\exists y\,(R_a x \wedge R_a y \wedge S(x,y)).$$

This uses the successor predicate $S$, but as we noted earlier, $S$ can be expressed in
FO(<). We can likewise write a sentence saying that there is no occurrence of $bb$.
An FO-sentence stating that the first letter of a word is $a$ was given in Example 1.13.
A similar sentence can be formed to say that the last letter is $b$. Note that all these
sentences are satisfied by the empty word as well, so that the conjunction of the
four sentences defines the language $(ab)^*$.

This language is also recognized by the first monoid that we exhibited in
Example 1.25, which is not aperiodic. This in no way contradicts Theorem 1.8, which
only says that some aperiodic monoid recognizes $L$.
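The characterization used in this example can be tested by brute force. In the sketch below each of the four sentences is evaluated by quantifying over the positions of the word, and the conjunction is compared with membership in $(ab)^*$ for all words of length at most 8. The function names are hypothetical, and the formulations of "the first letter is $a$" and "the last letter is $b$" are assumptions made here (every position with no position before it, respectively after it, carries the given letter), not necessarily the sentences of Example 1.13.

```python
import re
from itertools import product

def no_factor(w, c):
    """not exists x exists y (R_c x and R_c y and S(x, y))"""
    positions = range(len(w))
    return not any(w[x] == c and w[y] == c and y == x + 1
                   for x in positions for y in positions)

def first_letter_is(w, c):
    """for all x: if no position is smaller than x, then R_c x"""
    positions = range(len(w))
    return all(w[x] == c for x in positions
               if not any(y < x for y in positions))

def last_letter_is(w, c):
    """for all x: if no position is larger than x, then R_c x"""
    positions = range(len(w))
    return all(w[x] == c for x in positions
               if not any(y > x for y in positions))

def satisfies_conjunction(w):
    return (no_factor(w, "a") and no_factor(w, "b")
            and first_letter_is(w, "a") and last_letter_is(w, "b"))

words = ["".join(t) for k in range(9) for t in product("ab", repeat=k)]
agrees = all(satisfies_conjunction(w) == bool(re.fullmatch(r"(ab)*", w))
             for w in words)
print(agrees)
```

All four predicates hold vacuously on the empty word, matching the remark that the sentences are satisfied by the empty word.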

Remark 1.8. The decision procedure outlined in the proof of Corollary 1.3 may
take exponential time in the size of an automaton accepting $L$, since it involves
computing the syntactic monoid of $L$ (see Example 1.24). While this procedure
may be improved, this decision problem is intrinsically difficult. In fact, it is known
to be PSPACE-complete (Cho and Huynh [2]).

We now turn to the proof of Theorem 1.8. We will show (4) $\Leftrightarrow$ (3) $\Rightarrow$ (1) $\Rightarrow$
(2) $\Rightarrow$ (4). By Theorem 1.7, every language is recognized by its syntactic monoid,
and the syntactic monoid of $L$ divides every monoid that recognizes $L$.
Also every divisor of an aperiodic monoid is aperiodic, since the property $m^n =
m^{n+1}$ for all elements $m$ in a monoid is inherited by morphic images and submonoids.
Thus (3) and (4) are equivalent.

The most difficult part of the proof is (3) $\Rightarrow$ (1). To prove this, we suppose
$L \subseteq A^*$ is recognized by a finite aperiodic monoid. This is equivalent to $L$ being
accepted by a complete deterministic automaton $\mathcal{A} = (Q,\delta,i,F)$ whose transition
monoid is aperiodic (see the proof of Theorem 1.6 and Remark 1.7). We will show
that for all $q,q' \in Q$, the set $L^{\mathcal{A}}_{q,q'} = \{w \mid q f_w^{\mathcal{A}} = q'\}$ is a star-free language. Since
$L$ is a finite union of such languages, $L$ is star-free.

The proof is by induction on the pair $(|Q|, |A|)$: the induction hypothesis is that
the claim holds for all automata with a strictly smaller state set, or with a state set
of the same size and a strictly smaller input alphabet. In the case $|Q| = 1$, $L$ is either
$A^*$ or $\emptyset$, which are star-free. In the case $|A| = 1$, so that $A = \{a\}$, aperiodicity
implies that $L$ is a finite union of singleton sets $\{a^k\}$, possibly together with the
language $a^r a^*$, where $r = |Q| - 1$, which is also star-free, since $a^* = \overline{\emptyset}$.

We thus assume both $|Q| > 1$ and $|A| > 1$. First suppose that for every $a \in A$,
$Q f_a^{\mathcal{A}} = Q$, so that $f_a^{\mathcal{A}}$ is a permutation of $Q$. Aperiodicity implies
$(f_a^{\mathcal{A}})^r = (f_a^{\mathcal{A}})^{r+1}$ for some $r$, and thus $f_a^{\mathcal{A}}$ is the identity map on $Q$.
Consequently $f_w^{\mathcal{A}}$ is the identity map for all $w \in A^*$, and thus the claim holds
trivially. We can therefore assume that there is some $a \in A$ such that

$$Q f_a^{\mathcal{A}} = Q' \subsetneq Q.$$

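We now define two new automata $\mathcal{B}$ and $\mathcal{C}$. Automaton $\mathcal{B}$ has state set $Q$,
input alphabet $B = A \setminus \{a\}$, and next-state function $\delta|_{Q \times B}$. We need not define
initial and final states for $\mathcal{B}$, because we are only interested in the state transitions
$f_w^{\mathcal{B}}$. Automaton $\mathcal{C}$ has state set $Q'$, input alphabet

$$C = \{(f_w^{\mathcal{B}}, a) \mid w \in B^*\},$$

and next-state function

$$\delta' \colon (q, (f_w^{\mathcal{B}}, a)) \mapsto q \cdot f_{wa}^{\mathcal{A}}.$$

This makes sense, because $Q f_a^{\mathcal{A}} = Q'$ and because $f_w^{\mathcal{B}} = f_{w'}^{\mathcal{B}}$ implies that
$f_{wa}^{\mathcal{A}} = f_{w'a}^{\mathcal{A}}$. The inductive hypothesis applies to both $\mathcal{B}$ and $\mathcal{C}$. (The transition
monoids of these automata inherit the aperiodicity of $\mathcal{A}$, because every transition in them is
the restriction of a transition in $\mathcal{A}$.)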

A word in $L^{\mathcal{A}}_{q,q'}$ can contain either no occurrences of $a$, a single occurrence of $a$,
or two or more occurrences of $a$. We can accordingly write $L^{\mathcal{A}}_{q,q'}$ as a finite union
of sets of the form

$$L^{\mathcal{B}}_{q,q'}, \qquad L^{\mathcal{B}}_{q,p}\, a\, L^{\mathcal{B}}_{p',q'}, \qquad L^{\mathcal{B}}_{q,p}\, a\, T_{p',q''}\, L^{\mathcal{B}}_{q'',q'},$$

where $p \in Q$, $p' = p \cdot f_a^{\mathcal{A}} \in Q'$, and $T_{p',q''} = L^{\mathcal{A}}_{p',q''} \cap A^*a$.

By the inductive hypothesis all the sets of the form $L^{\mathcal{B}}_{p,p'}$ are star-free, so it
remains to show that $T_{p',q''}$ is a star-free language. We can factor any $w \in A^*a$
uniquely as

$$w = v_1 a \cdots v_k a,$$

where $v_1,\ldots,v_k \in B^*$. Let us associate to $w$ the word

$$w_C = c_1 \cdots c_k \in C^*,$$

where $c_j = (f_{v_j}^{\mathcal{B}}, a) \in C$. By the inductive hypothesis, the language $L^{\mathcal{C}}_{p',q''}$ is
star-free. So we need to show that if $R \subseteq C^*$ is star-free, then
$\Psi(R) = \{w \in A^*a \mid w_C \in R\}$ is also star-free, since $T_{p',q''} = \Psi(L^{\mathcal{C}}_{p',q''})$.
It is thus enough to show:

(i) If $c \in C$, then $\Psi(\{c\})$ is star-free.

(ii) If $\Psi(R)$ is star-free, then $\Psi(C^* \setminus R)$ is star-free.

(iii) If $\Psi(R_1)$, $\Psi(R_2)$ are star-free, then $\Psi(R_1 \cup R_2)$ is star-free.

(iv) If $\Psi(R_1)$, $\Psi(R_2)$ are star-free, then $\Psi(R_1R_2)$ is star-free.

For (i), note that $\Psi(\{c\}) = Sa$, where $S = \{v \in B^* \mid c = (f_v^{\mathcal{B}}, a)\}$. Since $S$
is a boolean combination of languages of the form $L^{\mathcal{B}}_{p,p'}$, $Sa$ is star-free. For the
other assertions, we clearly have $\Psi(C^* \setminus R) = A^*a \cap (A^* \setminus \Psi(R))$,
$\Psi(R_1 \cup R_2) = \Psi(R_1) \cup \Psi(R_2)$, and $\Psi(R_1R_2) = \Psi(R_1)\Psi(R_2)$. This completes
the proof that (3) $\Rightarrow$ (1).

To prove (1) $\Rightarrow$ (2), we need to show that every star-free language is first-order
definable. Since the singleton sets $\{a\}$ for $a \in A$ are clearly first-order definable, and
since the boolean operations are part of first-order logic, this reduces to showing
that if $L_1, L_2 \subseteq A^*$ are first-order definable, then so is $L_1L_2$. To do this, we
introduce the notion of relativizing a first-order sentence. Let $\varphi$ be a sentence of
FO(<) and $x$ a variable symbol that does not occur in $\varphi$. We define a formula
$\varphi^{<x}$ with one free variable with the following property: Let $\nu$ be an interpretation
mapping $x$ to $i \in \mathrm{Dom}(u)$, and let $v$ be the prefix of $u$ with domain $\{0,\ldots,i-1\}$.
Then $u, \nu \models \varphi^{<x}$ if and only if $v \models \varphi$. To construct $\varphi^{<x}$, we simply work from
the outermost quantifier of $\varphi$ inward, replacing each quantified subformula $\exists y\,\alpha$ by
$\exists y\,((y < x) \wedge \alpha)$. We define $\varphi^{>x}$ and $\varphi^{\leq x}$ analogously.

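Now suppose $\varphi, \psi$ are first-order sentences defining $L_1$ and $L_2$, respectively. Let
$x$ be a variable symbol that does not occur in $\varphi$ or $\psi$. We have $L_1L_2$ defined by the
sentence

$$\exists x\,(\varphi^{\leq x} \wedge \psi^{>x}) \quad \text{if } \varepsilon \notin L_1,$$
$$\exists x\,(\varphi^{\leq x} \wedge \psi^{>x}) \vee \psi \quad \text{if } \varepsilon \in L_1.$$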
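Relativization and the concatenation sentence can be tried out on small cases. The sketch below is an illustration under explicit assumptions: formulas are encoded as nested tuples (an encoding invented here), the split variable `x_split` is assumed not to occur in the input sentences, and the concatenation sentence is read as relativizing the first sentence to the positions up to and including $x$ and the second to the positions strictly after $x$, with an extra disjunct when the empty word belongs to the first language.

```python
import re
from itertools import product

# Hypothetical encoding of formulas as nested tuples:
#   ("R", c, x)   letter predicate R_c x
#   ("lt", x, y), ("le", x, y), ("gt", x, y)   order comparisons
#   ("not", f), ("and", f, g), ("or", f, g), ("ex", y, f)

def holds(w, f, env=None):
    """Evaluate formula f on word w under the variable assignment env."""
    env = env or {}
    op = f[0]
    if op == "R":
        return w[env[f[2]]] == f[1]
    if op == "lt":
        return env[f[1]] < env[f[2]]
    if op == "le":
        return env[f[1]] <= env[f[2]]
    if op == "gt":
        return env[f[1]] > env[f[2]]
    if op == "not":
        return not holds(w, f[1], env)
    if op == "and":
        return holds(w, f[1], env) and holds(w, f[2], env)
    if op == "or":
        return holds(w, f[1], env) or holds(w, f[2], env)
    if op == "ex":
        return any(holds(w, f[2], {**env, f[1]: i}) for i in range(len(w)))
    raise ValueError(op)

def relativize(f, rel, x):
    """Restrict every quantified variable y of f by the atom (rel, y, x)."""
    op = f[0]
    if op == "ex":
        return ("ex", f[1], ("and", (rel, f[1], x), relativize(f[2], rel, x)))
    if op == "not":
        return ("not", relativize(f[1], rel, x))
    if op in ("and", "or"):
        return (op, relativize(f[1], rel, x), relativize(f[2], rel, x))
    return f  # atomic formulas are unchanged

def concat_sentence(phi, psi, eps_in_first, x="x_split"):
    body = ("ex", x, ("and", relativize(phi, "le", x), relativize(psi, "gt", x)))
    return ("or", body, psi) if eps_in_first else body

no_b = ("not", ("ex", "y", ("R", "b", "y")))   # defines a*
no_a = ("not", ("ex", "y", ("R", "a", "y")))   # defines b*
has_a = ("ex", "y", ("R", "a", "y"))           # words containing an a

words = ["".join(t) for k in range(7) for t in product("ab", repeat=k)]
s1 = concat_sentence(no_b, no_a, eps_in_first=True)     # should define a*b*
s2 = concat_sentence(has_a, no_a, eps_in_first=False)   # words containing an a
check1 = all(holds(w, s1) == bool(re.fullmatch(r"a*b*", w)) for w in words)
check2 = all(holds(w, s2) == ("a" in w) for w in words)
print(check1, check2)
```

The evaluator is exponential in the quantifier depth and is meant only for experiments on short words.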






42 H. Straubing and P. Weil 

To prove (2) $\Rightarrow$ (4), we need to show that the syntactic monoid of every
first-order definable language in $A^*$ is aperiodic. We will proceed as in Section 1.4.2, and
treat a first-order formula with free variables contained in $\{x_1,\ldots,x_p\}$ as defining a
language over the extended alphabet $B_p = A \times \{0,1\}^p$. We will show by induction
on the quantifier depth that every first-order definable language $L \subseteq B_p^*$ in this
extended sense has an aperiodic syntactic monoid. More precisely, we will show that
for each such $L$ there exists an integer $q > 0$ such that for all $v \in B_p^*$, $v^q \cong_L v^{q+1}$.
By Lemma 1.2, this implies aperiodicity.

First suppose $L$ is defined by one of the atomic formulas $x_1 < x_2$ or $R_a x_1$. Let
$u,v,w \in B_p^*$. If $v$ has a letter with a 1 in one of its last $p$ components, then neither
$uv^2w$ nor $uv^3w$ can be in $L$, since only one letter of a word in $L$ can have a 1 in a
given component. If $v$ has no such letter, then membership of $uvw$ in $L$ is witnessed
by the relative positions and values of letters in $u$ and $w$, so that $uvw \in L$ if and
only if $uv^2w \in L$. Thus in all cases, we have $uv^2w \in L$ if and only if $uv^3w \in L$, so
that $v^2 \cong_L v^3$.

Now suppose the claim is true for $L_1, L_2 \subseteq B_p^*$ defined by formulas $\varphi_1, \varphi_2$,
and suppose $L$ is defined by $\varphi_1 \vee \varphi_2$. We have, by assumption, $v^q \cong_{L_1} v^{q+1}$ and
$v^q \cong_{L_2} v^{q+1}$, for some $q > 0$. (The exponents for these two languages are, a priori,
different, but we can then choose $q$ to be the maximum of the two exponents.)
Now $\varphi_1 \vee \varphi_2$ defines $L_1 \cup L_2$, and we have directly $uv^qw \in L_1 \cup L_2$ if and only if
$uv^{q+1}w \in L_1 \cup L_2$.

Care must be taken with the negation operator, since it does not exactly
correspond to the boolean complement. We can assume that the exponent $q$ for $L_1$
is at least 2. Let $L_1'$ be the language defined by $\neg\varphi_1$. Suppose $uv^qw \in L_1'$. Then
$uv^qw \notin L_1$, and thus $uv^{q+1}w \notin L_1$. Further, $v$ cannot contain a 1 in the last $p$
components of any of its positions, so $uv^{q+1}w$ has exactly one occurrence of 1 in
each of the last $p$ components, and thus is in $L_1'$. The same argument shows that if
$uv^{q+1}w \in L_1'$, then so is $uv^qw$. Thus $v^q \cong_{L_1'} v^{q+1}$.

So now let $K \subseteq B_{p-1}^*$ be the language defined by $\exists x_p\, \varphi_1$. Let $v \in B_{p-1}^*$. We
will show $v^{2q+1} \cong_K v^{2q+2}$. Suppose $uv^{2q+1}w \in K$. Let us extend each letter in this
word by adding a $p$-th component with 0. We will still denote the resulting word as
$uv^{2q+1}w$. Since $K$ is defined by $\exists x_p\, \varphi_1$, we can switch the $p$-th component of some
letter from 0 to 1 to obtain a word $z \in B_p^*$ such that $z \in L_1$. Now, wherever the position in
which we switched the $p$-th component is located, at least $q$ consecutive occurrences
of $v$ will be left intact. We thus find that $z$ can be written in the form $xv^qy$, for some
$x, y \in B_p^*$. (The extreme case is when the position is within the middle occurrence
of $v$, in which case we get two factors of the form $v^q$.) Thus $xv^{q+1}y \in L_1$. If we
now switch the changed 1 back to 0, we find $uv^{2q+2}w \in K$. The identical argument
shows that $uv^{2q+2}w \in K$ implies $uv^{2q+1}w \in K$. Thus $v^{2q+1} \cong_K v^{2q+2}$, as claimed.

Remark 1.9. Interesting presentations of proofs of all or part of Theorem 1.8 can
be found, for instance, in the work of Perrin [15], Straubing [21] and Diekert and
Gastin [3].